Tutorial on stochastic optimal control theory, ICTP Summer School 2012 on Machine Learning, Trieste, August 2012

Course material:

Each topic below lists its subtopics, the reading material, and the recommended exercises.

1 Discrete time control
  Subtopics:
    dynamic programming
    Bellman equation
  Material:
    Bertsekas 2-5, 13-14, 18, 21-32 (2nd ed.)
    Bertsekas 2-5, 10-12, 16-27, 30-32 (1st ed.)
    Kappen ICML tutorial 1.2
  Recommended exercises:
    Bertsekas 1.1 a and b, 1.2

2 Continuous time control
  Subtopics:
    Hamilton-Jacobi-Bellman equation
    Pontryagin Minimum Principle
    Stochastic differential equations
  Material:
    Kappen ICML tutorial 1.3
  Recommended exercises:
    extra exercise 1, 2a, 2b

3 Stochastic optimal control
  Subtopics:
    LQ examples, portfolio management
    Path integral control theory
  Material:
    Kappen ICML tutorial 1.4, 1.6
  Recommended exercises:
    extra exercise 2c, 3

4 Path integral control theory
  Subtopics:
    Delayed choice example
    Importance sampling
    Laplace approximation
    How to control a device?
    KL control theory and link to path integrals
    Multi-agent systems
    Stationary KL control
    (Dual control: the problem of joint inference and control)
    (Risk sensitive control)
    (Numerical examples: particle in a box, darts, N joint arm, coordination of agents)
  Material:
    Kappen ICML tutorial 1.6, 1.7, (1.5)
    Theodorou et al., AISTATS 2010
    Mensink et al., ECAI 2010
    van den Broek et al., JAIR 2008
    van den Broek et al., UAI 2010
    Kappen et al., arXiv:0901.0633, 2009
  Recommended exercises:
    extra exercise 4, 5

Matlab code for n joint problem
Here is a directory of Matlab files that lets you run and inspect the variational approximation for the n joint stochastic control problem discussed in section 1.6.7 of the tutorial text. Type tar xvf njoints.tar to unpack the directory, then run file1.m. In file1.m you can select demo1 (3 joint arm) or demo2 (10 joint arm). You can also try larger n, but be sure to adjust eta, which controls the smoothing of the variational fixed point equations. To compare the results with exact computation (recommended only for 2 joints), set METHOD='exact'. There are also implementations of importance sampling (which does not work very well) and Metropolis-Hastings sampling (which works well, but is not as stable as the variational approximation).
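
To give a feel for why eta may need adjusting, here is a minimal sketch of a smoothed (damped) fixed point iteration. It is written in Python rather than Matlab and is not taken from the njoints code. The function names, the toy scalar equation, and the convention that eta weights the new value (so eta = 1 means no smoothing) are all assumptions made for illustration; the convention used in file1.m may differ.

```python
import math


def damped_fixed_point(f, x0, eta, tol=1e-10, max_iter=1000):
    """Iterate x <- (1 - eta) * x + eta * f(x).

    Returns (x, iterations, converged). eta = 1 is the plain iteration;
    smaller eta mixes in more of the previous value (more smoothing).
    """
    x = x0
    for iteration in range(1, max_iter + 1):
        x_new = (1.0 - eta) * x + eta * f(x)
        if abs(x_new - x) < tol:
            return x_new, iteration, True
        x = x_new
    return x, max_iter, False


def toy_update(x):
    # Hypothetical scalar equation x = tanh(0.5 - 3 x), chosen only because
    # its plain iteration oscillates; it is not the n joint update.
    return math.tanh(0.5 - 3.0 * x)


if __name__ == "__main__":
    for eta in (1.0, 0.3):
        x, iterations, converged = damped_fixed_point(toy_update, 0.0, eta)
        print(f"eta={eta}: x={x:.6f}, iterations={iterations}, converged={converged}")
```

In this toy case the plain iteration (eta = 1) settles into an oscillation and never meets the tolerance, while eta = 0.3 converges. The same trade-off is plausibly what you tune when moving to larger n: more smoothing buys stability at the cost of more iterations.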