Short course on control theory and dynamic programming - Madrid, January 2012

The course provides an introduction to stochastic optimal control theory. It is based in part on a tutorial given at ICML 2008 and on selected material from the book Dynamic programming and optimal control by Dimitri Bertsekas.

Course information

One week given at Universidad Autonoma Madrid
Form: Lectures, exercises, practicals
For: Master's students and PhD students
Lecturer: Bert Kappen

Course material:

Chapter 1 from the book Dynamic programming and optimal control by Dimitri Bertsekas: Copies 1a, Copies 1b. Here are the slides for Bertsekas' course.
My ICML 2008 tutorial text, which will be published in the book Inference and Learning in Dynamical Models (Cambridge University Press 2010), edited by David Barber, Taylan Cemgil and Sylvia Chiappa.
The slides for the course. Not all slides are used; the parts covered are:
- deterministic control. Bellman equation (9-34)
- Pontryagin minimum principle (PMP) (35-43, 50)
- stochastic processes (51-52)
- Kolmogorov and Fokker-Planck equations (53-56)
- stochastic optimal control (61-72)
- example portfolio selection (55-59)
- path integral control, delayed choice (83-93)
- path integral expression (98-103)
- sampling methods for computing optimal control (104-105, 111-112)
- acrobot (121-127)
- variational method (132-138) for exercise
Further reading on path integral control theory
- Theodorou et al., AISTATS 2010 Application of path integral control to real robots.
- Mensink et al., ECAI 2010 Application of expectation propagation as an approximate inference method for a darts throwing control problem.
- van den Broek et al., JAIR 2008 A (model based) sampling approach to control a robot arm.
- van den Broek et al., UAI 2010 Extension of path integral control to risk sensitive control.
- Kappen et al., arXiv:0901.0633, 2009 A formulation of path integral control as a special case of KL control and an application to multi-agent coordination.

Schedule:

Session 1, Jan 10
Topic: Discrete time control, dynamic programming, Bellman equation
Chapter: Bertsekas 2-5, 13-14, 18, 21-32 (2nd ed.); Bertsekas 2-5, 10-12, 16-27, 30-32 (1st ed.); Kappen ICML tutorial 1.2; slides up to 34
Exercises: Bertsekas 1.2; extra exercise 1, 2a,b

Session 2, Jan 11
Topic: Continuous time control, Hamilton-Jacobi-Bellman equation, Pontryagin Minimum Principle, stochastic differential equations, stochastic optimal control, LQ examples, portfolio management
Chapter: Kappen ICML tutorial 1.3, 1.4; slides up to 59
Exercises: extra exercise 2a,b

Session 3, Jan 12
Topic: Path integral control theory
Chapter: Kappen ICML tutorial 1.5, 1.6, 1.7; slides up to 93
Exercises: extra exercise 2c, 3

Session 4, Jan 12
Topic: Path integral control theory, MC sampling solution, numerical examples (particle in a box, N joint arm, robot learning)
Chapter: Kappen ICML tutorial 1.7; slides up to 127
Exercises: extra exercise 4,5

Matlab code for n joint problem:
Here is a directory of Matlab files that lets you run and inspect the variational approximation for the n joint stochastic control problem, as discussed in section 1.6.7 of the tutorial text. Type tar xvf njoints.tar to unpack the directory, then run file1.m. In file1.m you can select demo1 (3 joint arm) or demo2 (10 joint arm). You can also try larger n, but be sure to adjust eta, which controls the smoothing of the variational fixed point equations. You can compare the results with the exact computation (only advisable for 2 joints) by setting METHOD='exact'. There is also an implementation of importance sampling (does not work very well) and of Metropolis Hastings sampling (works well, but is not as stable as the variational approximation).
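To give a feel for the MC sampling solution from session 4, here is a minimal Python sketch on a toy problem. It is not part of the Matlab directory above and does not reproduce the n joint code. It assumes one-dimensional dynamics dx = u dt + dxi with noise variance nu dt, control cost (R/2) u^2, no state-dependent path cost, and lambda = nu R. Under these assumptions the optimal control is (E_w[x_T] - x) / tau, where the end points x_T are sampled from the uncontrolled dynamics and weighted by exp(-phi(x_T)/lambda). The function name, parameter names and the quadratic end cost are chosen for illustration only.

```python
import math
import random


def pi_control_estimate(x, tau, nu, R, end_cost, n_samples=20000, seed=0):
    """Monte Carlo estimate of the optimal control at state x, time tau to go.

    Assumptions (illustrative, not taken from the course code):
    dx = u dt + dxi, noise variance nu*dt, control cost (R/2) u^2,
    no state-dependent path cost, and lambda = nu * R.
    """
    lam = nu * R
    rng = random.Random(seed)
    sigma = math.sqrt(nu * tau)  # std of the uncontrolled end point
    weight_sum = 0.0
    weighted_end = 0.0
    for _ in range(n_samples):
        x_end = x + sigma * rng.gauss(0.0, 1.0)  # sample uncontrolled end point
        w = math.exp(-end_cost(x_end) / lam)     # path weight exp(-S/lambda)
        weight_sum += w
        weighted_end += w * x_end
    # Weighted mean end point, converted into a drift over the remaining time.
    return (weighted_end / weight_sum - x) / tau


# Quadratic end cost phi(x) = alpha x^2 / 2, for which the LQ solution is known.
alpha, R, nu, x, tau = 1.0, 1.0, 1.0, 1.0, 1.0
u_mc = pi_control_estimate(x, tau, nu, R, lambda y: 0.5 * alpha * y * y)
u_exact = -alpha * x / (R + alpha * tau)  # from the Riccati equation
print(u_mc, u_exact)
```

With a quadratic end cost the sampled estimate can be checked against the linear-quadratic solution u = -alpha x / (R + alpha tau). For end costs that are sharply peaked relative to the noise, most samples receive negligible weight, which is one reason plain importance sampling can perform poorly.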
