Planning actions is a major computational challenge for animals and robots alike. Both biological and AI systems must operate under high uncertainty, which has many sources, such as the unknown behavior of other animals or noise from limited sensors. Control of such systems is therefore very different from an industrial robotics setting, where full knowledge can be assumed and noise can largely be ignored.

Optimizing a sequence of actions to attain some future goal is the general topic of control theory. The general stochastic control problem is intractable: solving it requires discretizing the state space, so memory and computation time grow exponentially with the number of dimensions. Starting in 2004, we have proposed a novel class of stochastic control problems, formulated using path integrals, that can be mapped onto a Bayesian inference problem. Because the path integral has a typical statistical mechanics form, it can be approximated in various ways, such as the Laplace approximation, Monte Carlo sampling, mean field approximations or belief propagation.
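
As an illustration of the Monte Carlo route, the following is a minimal sketch for a one-dimensional toy problem that is not taken from our publications. It assumes dynamics dx = u dt + dξ with noise variance ν dt, a quadratic end cost (α/2) x_T², a control cost (R/2) u² and λ = νR. Under these assumptions the cost-to-go is J = -λ log E[exp(-S/λ)], where the expectation is over uncontrolled paths with path cost S, and the control is the weighted average of the first noise increment divided by dt. All function and parameter names are hypothetical. The toy problem is linear-quadratic, so a closed-form solution is available for comparison.

```python
import math
import random


def path_integral_estimate(x0, T=1.0, n_steps=10, n_samples=50_000,
                           nu=1.0, R=1.0, alpha=1.0, seed=0):
    """Monte Carlo estimate of the cost-to-go J and control u at (x0, t=0).

    Assumed toy problem: dx = u dt + dxi with Var(dxi) = nu * dt,
    cost = alpha/2 * x_T**2 + integral of R/2 * u**2 dt, lambda = nu * R.
    """
    rng = random.Random(seed)
    lam = nu * R
    dt = T / n_steps
    sigma = math.sqrt(nu * dt)
    sum_w = 0.0
    sum_w_xi0 = 0.0
    for _ in range(n_samples):
        x = x0
        xi0 = 0.0
        for k in range(n_steps):
            xi = rng.gauss(0.0, sigma)
            if k == 0:
                xi0 = xi          # remember the first noise increment
            x += xi               # uncontrolled dynamics (u = 0)
        S = 0.5 * alpha * x * x   # path cost: end cost only in this toy problem
        w = math.exp(-S / lam)    # path weight
        sum_w += w
        sum_w_xi0 += w * xi0
    psi = sum_w / n_samples
    J = -lam * math.log(psi)
    u = sum_w_xi0 / (sum_w * dt)  # weighted mean of first increment, per unit time
    return J, u


def exact_lq(x0, T=1.0, nu=1.0, R=1.0, alpha=1.0):
    """Closed-form J and u for the same linear-quadratic toy problem."""
    J = (0.5 * nu * R * math.log(1.0 + alpha * T / R)
         + 0.5 * alpha * x0 * x0 / (1.0 + alpha * T / R))
    u = -alpha * x0 / (R + alpha * T)
    return J, u


if __name__ == "__main__":
    print("Monte Carlo:", path_integral_estimate(1.0))
    print("closed form:", exact_lq(1.0))
```

The estimate of u is noisier than the estimate of J, because the first noise increment divided by dt has a variance of order 1/dt. This is one reason why naive sampling from the uncontrolled dynamics scales poorly.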

Several robotics research groups worldwide have used this approach and shown that it significantly outperforms other state-of-the-art reinforcement learning methods. We are currently applying these methods to coordinate the movement of quadrotors (with UCL), and we collaborate with Satoshi Satoh (ATR Japan), Takamitsu Matsubara (NAIST Japan) and Jan Peters (TU Darmstadt) on specific issues related to the applicability of control theory to robotics.

The path integral theory makes quantitative predictions about optimal planning under uncertainty. One such prediction is the phenomenon of delayed choice: when uncertain about the future, it is wise, and optimal according to the theory, to delay a decision. The phenomenon is demonstrated in the Kappenball app (available on itunes.apple.com).
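
Delayed choice can be made concrete with a small sketch. It assumes a one-dimensional toy problem that is not described on this page: dynamics dx = u dt + dξ with noise variance ν dt, a quadratic control cost, and the requirement to end at either x = +1 or x = -1 at time T. For this assumed problem the path integral can be worked out in closed form, which gives the control u(x, τ) = (tanh(x / (ντ)) - x) / τ, with τ = T - t the time remaining. The function names are hypothetical.

```python
import math


def delayed_choice_control(x, tau, nu):
    """Control for the assumed two-target toy problem.

    x: current position, tau: time remaining (T - t), nu: noise variance rate.
    """
    return (math.tanh(x / (nu * tau)) - x) / tau


def is_delaying(tau, nu):
    """True while nu * tau > 1: the control steers small x back toward 0."""
    return nu * tau > 1.0


if __name__ == "__main__":
    nu = 1.0
    x = 0.1  # slightly on the side of the target at +1
    for tau in (2.0, 1.5, 0.75, 0.5):
        u = delayed_choice_control(x, tau, nu)
        phase = "delay" if is_delaying(tau, nu) else "commit"
        print(f"tau={tau:4.2f}  u={u:+.3f}  ({phase})")
```

Near x = 0 the control is approximately x (1/(ντ) - 1) / τ. While ντ > 1 it pushes the state back toward the midpoint, so no target is chosen yet. Once ντ < 1 the midpoint becomes unstable and the control commits to the nearer target.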

Current research aims at improving path integral control methods. We recently showed that the optimal control also provides an optimal sampling procedure. This provides a theoretical basis for the design of improved adaptive sampling schemes that 'learn' the optimal control. Furthermore, we provide novel theory that shows how to construct feedback controllers with arbitrarily complex state dependence.
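
The link between control and sampling can be illustrated with a minimal sketch on an assumed linear-quadratic toy problem (dx = u dt + dξ, noise variance ν dt, end cost (α/2) x_T², control cost (R/2) u², λ = νR). Paths are sampled under a controller u, and each path is reweighted with exp(-S_u/λ), where S_u adds the terms (R/2) u² dt + R u dξ to the end cost. The sketch does not learn the controller: it plugs in the known closed-form optimal control of this toy problem and compares the effective sample size with that of uncontrolled sampling. All names are hypothetical.

```python
import math
import random


def sample_weights(controller, x0=2.0, T=1.0, n_steps=100, n_samples=4000,
                   nu=1.0, R=1.0, alpha=10.0, seed=0):
    """Importance weights exp(-S_u / lambda) for paths sampled under `controller`."""
    rng = random.Random(seed)
    lam = nu * R
    dt = T / n_steps
    sigma = math.sqrt(nu * dt)
    weights = []
    for _ in range(n_samples):
        x, S = x0, 0.0
        for k in range(n_steps):
            u = controller(x, T - k * dt)              # second argument: time remaining
            xi = rng.gauss(0.0, sigma)
            S += 0.5 * R * u * u * dt + R * u * xi     # correction for sampling under u
            x += u * dt + xi
        S += 0.5 * alpha * x * x                       # end cost
        weights.append(math.exp(-S / lam))
    return weights


def summarize(weights):
    """Return (estimate of psi = E[w], effective sample size as a fraction)."""
    n = len(weights)
    s1 = sum(weights)
    s2 = sum(w * w for w in weights)
    return s1 / n, s1 * s1 / (n * s2)


def no_control(x, tau):
    return 0.0


def lq_optimal(x, tau, R=1.0, alpha=10.0):
    """Closed-form optimal control of the assumed toy problem."""
    return -alpha * x / (R + alpha * tau)


if __name__ == "__main__":
    print("uncontrolled (psi, ESS):", summarize(sample_weights(no_control)))
    print("optimal ctrl (psi, ESS):", summarize(sample_weights(lq_optimal)))
```

Both samplers estimate the same quantity, but under the optimal control the weights are nearly equal, so almost every sample contributes. The remaining spread in the weights comes from time discretization. An adaptive scheme would replace `lq_optimal` with a controller that is re-estimated from the weighted samples in each iteration.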

International Journal of Control, pp. 1-8, 2018.

Role of synaptic stochasticity in training low-precision neural networks. Physical Review Letters, vol. 120, no. 26, pp. 268103-1-6, 2018.

Nonlinear deconvolution by sampling biophysically plausible hemodynamic models. arXiv, 2018.

Effective connectivity from single trial fMRI data by sampling biologically plausible models. arXiv, 2018.

Consistent adaptive multiple importance sampling and controlled diffusions. arXiv, 2018.