Short course on control theory and dynamic programming - Gatsby Unit UCL Feb 2011

Course information

Lecturer: Bert Kappen (www.snn.ru.nl/~bertk)

Course material:

The course provides an introduction to stochastic optimal control theory. It is based on chapter 1 of the book Dynamic Programming and Optimal Control by Dimitri Bertsekas and on part of my tutorial at ICML 2008.

Chapter 1 of the book Dynamic Programming and Optimal Control by Dimitri Bertsekas: Copies 1a, Copies 1b. Here are the slides for Bertsekas' course.
My ICML 2008 tutorial text, which will be published in the book Inference and Learning in Dynamical Models (Cambridge University Press 2010), edited by David Barber, Taylan Cemgil and Sylvia Chiappa. The ICML 2008 tutorial website contains other tutorial material and pointers to useful further material.
Extensions of that material, based on both published and unpublished results of recent years.
These are the slides for the course.

Lecture	Date	Topic	Material	Recommended exercises
1	Feb 8, 11:00-13:00	Discrete time control; dynamic programming; Bellman equation; Continuous time control; Hamilton-Jacobi-Bellman equation; Pontryagin Minimum Principle	Bertsekas 2-5, 13-14, 18, 21-32 (2nd ed.); Bertsekas 2-5, 10-12, 16-27, 30-32 (1st ed.); Kappen ICML tutorial 1.2, 1.3; slides	Bertsekas 1.1 a and b, 1.2; extra exercise 1, 2a,b
2	Feb 9, 11:00-13:00	Recap of PMP and examples; Inverse control; Stochastic differential equations; Stochastic optimal control; LQ examples, Portfolio management	Kappen ICML tutorial 1.4; slides	
3	Feb 21, 10:00-13:00	Dual control: the problem of joint inference and control; Path integral control theory	Kappen ICML tutorial 1.5, 1.6, 1.7	extra exercise 2c, 3
4	Feb 22, 10:00-13:00	Path integral control theory; MC sampling solution; Laplace approximation; Numerical examples (particle in a box, Darts, N joint arm, Coordination of agents, Robot learning); Risk sensitive control; KL control theory and link to path integrals	Kappen ICML tutorial 1.7; Theodorou et al., AISTATS 2010; Mensink et al., ECAI 2010; van den Broek et al., JAIR 2008; van den Broek et al., UAI 2010; Kappen et al., arXiv:0901.0633, 2009	extra exercise 4, 5; Matlab code for n joint problem

Here is a directory of Matlab files, which allows you to run and inspect the variational approximation for the n joint stochastic control problem as discussed in section 1.6.7 of the tutorial text. Type tar xvf njoints.tar to unpack the directory and simply run file1.m. In file1.m you can select demo1 (3 joint arm) or demo2 (10 joint arm). You can also try larger n, but be sure to adjust eta for the smoothing of the variational fixed point equations. You can compare the results with exact computation (only recommended for 2 joints) by setting METHOD='exact'. There is also an implementation of importance sampling (does not work very well) and of Metropolis Hastings sampling (works well, but is not as stable as the variational approximation).
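
For readers without Matlab, the Monte Carlo sampling idea from lecture 4 can be illustrated on a much simpler problem than the n joint arm. The Python sketch below is not a port of the Matlab files above, and every name in it (mc_control, phi, alpha, and so on) is a hypothetical choice made for this illustration. It assumes one-dimensional dynamics dx = u dt + dxi with noise variance nu dt, control cost (R/2) u^2, no path cost and a quadratic end cost phi(x) = (alpha/2) x^2. Under those assumptions the optimal control can be estimated by sampling end points of the uncontrolled diffusion, weighting them by exp(-phi/lambda) with lambda = nu R, and comparing the weighted mean end point with the current state. For this linear-quadratic case the exact answer is known, so the estimate can be checked.

```python
import numpy as np

def mc_control(x, t, T, nu, R, phi, n_samples=100_000, seed=0):
    """Naive Monte Carlo estimate of the optimal control u(x, t).

    Assumed setting (an illustration, not the n joint problem):
    dx = u dt + dxi with <dxi^2> = nu dt, control cost (R/2) u^2,
    no path cost, and end cost phi(x_T).
    """
    rng = np.random.default_rng(seed)
    lam = nu * R  # lambda = nu * R relates noise level and control cost
    # With no path cost only the end point matters, so x_T is sampled
    # directly from the uncontrolled diffusion started at x at time t.
    y = x + np.sqrt(nu * (T - t)) * rng.standard_normal(n_samples)
    log_w = -phi(y) / lam
    w = np.exp(log_w - log_w.max())  # subtract the max for numerical stability
    w /= w.sum()
    # u = (weighted mean end point - current state) / time to go
    return (np.dot(w, y) - x) / (T - t)

alpha, R, nu, T = 1.0, 1.0, 1.0, 1.0
x0, t0 = 1.0, 0.0

u_mc = mc_control(x0, t0, T, nu, R, lambda y: 0.5 * alpha * y**2)
# Exact linear-quadratic solution for this toy problem.
u_exact = -alpha * x0 / (R + alpha * (T - t0))
print("Monte Carlo estimate:", u_mc)
print("Exact LQ control:    ", u_exact)
```

Sampling from the uncontrolled dynamics is the simplest possible proposal. When the end cost is sharply peaked or the noise is small, a few samples carry almost all of the weight, so the estimate becomes unreliable.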
