Workshop

The Statistical Physics of Inference and Control Theory

Granada, Spain
September 12-16, 2012




ABSTRACTS:

Bierkens, Joris  Radboud University Nijmegen Probabilistic solution of relative entropy weighted control. We show that stochastic control problems with a particular cost structure involving a relative entropy term admit a purely probabilistic solution, without the necessity of applying the dynamic programming principle. The argument is as follows.  Minimization of the expectation of a random variable with respect to the underlying probability measure, penalized by relative entropy, may be solved exactly. In the case where the randomness is generated by a standard Brownian motion, this exact solution can be written as a Girsanov density. The stochastic process appearing in the Girsanov exponent has the role of control process, and the relative entropy of the change of probability measure is equal to the integral of the square of this process. An explicit expression for the control process may be obtained in terms of the Malliavin derivative of the density process. The theory is applied to the problem of minimizing the maximum of a Brownian motion (penalized by the relative entropy), leading to an explicit expression of the optimal control law in this case. The theory is then applied to a stochastic process with jumps, illustrating the generality of the method. The link to linearization of the Hamilton-Jacobi-Bellman equation is made for the case of diffusion processes.
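For reference, the exact solution of the entropy-penalized minimization that the abstract builds on can be stated in one line (a standard identity; the notation here is illustrative):

```latex
% Minimizing expected cost plus a relative-entropy penalty over measures Q << P:
%   min_Q  E_Q[C] + KL(Q || P)
% is solved in closed form by the Gibbs (Girsanov-type) tilting of P,
% with optimal value  -log E_P[exp(-C)].
\min_{Q \ll P} \Big\{ \mathbb{E}_Q[C] + \mathrm{KL}(Q\,\|\,P) \Big\}
  = -\log \mathbb{E}_P\!\left[e^{-C}\right],
\qquad
\frac{dQ^*}{dP} = \frac{e^{-C}}{\mathbb{E}_P\!\left[e^{-C}\right]}.
```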
     
Braun, Daniel and Pedro Ortega  Max Planck Institut, Tuebingen  Thermodynamics as a theory of bounded rational decision-making. Perfectly rational decision-makers maximize expected utility, but crucially ignore the resource costs incurred when determining optimal actions. Here we propose an information-theoretic formalization of bounded rational decision-making where decision-makers trade off expected utility and information processing costs. As a result, the decision-making problem can be rephrased in terms of well-known concepts from thermodynamics and statistical physics, such that the same exponential family distributions that govern statistical ensembles can be used to describe the stochastic choice behavior of bounded decision-makers. This framework not only explains some well-known experimental deviations from expected utility theory, but also reproduces psychophysical choice patterns captured by diffusion-to-bound models. Furthermore, this framework allows rederiving a number of decision-making schemes, including risk-sensitive and robust (minimax) decision-making, as well as more recent approximately optimal schemes that are based on the relative entropy. In the limit when resource costs are ignored, the maximum expected utility principle is recovered. Since most of the mathematical machinery can be borrowed from statistical physics, the main contribution is to show how a thermodynamic model of bounded rationality can provide a unified view of diverse decision-making phenomena and control schemes.
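The trade-off the abstract describes can be written compactly. The following form, with inverse temperature β playing the role of the resource parameter, is a minimal sketch of the idea rather than the authors' exact notation:

```latex
% Bounded-rational choice: trade expected utility against the information
% cost of moving from a prior policy p_0 to a posterior policy p.
\max_{p} \;\; \mathbb{E}_{p}[U(a)] - \frac{1}{\beta}\,\mathrm{KL}\big(p \,\|\, p_0\big)
\quad\Longrightarrow\quad
p^{*}(a) \propto p_0(a)\, e^{\beta U(a)} .
% As beta -> infinity the resource cost vanishes and maximum expected
% utility is recovered; finite beta gives the exponential-family
% (Boltzmann-like) stochastic choice behavior mentioned above.
```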
     
Brockett, Roger  Harvard  Minimal attention control. We may reasonably ask why optimal control theory has not been more useful in understanding the control mechanisms found in biology. The questions there range from understanding the control of the operation of an individual cell to the motor control of the complete organism. Given that evolution has had as long as it has to optimize brain and muscle/skeletal structures, why is it that we don't find optimal control theory to be more effective in explaining these structures? Looking more critically at optimal control theory in an engineering setting, one observes that there are a great many applications in which the payoff for implementing an "optimal" relationship between sensed signals and control variables does not justify the cost of the equipment needed to achieve it. For example, in high volume consumer goods, such as dish washers and clothes dryers, it is inexpensive to sense the temperature of the water or air, but the benefits associated with implementing a linear relationship between the temperature of the mixed water and the flow from the hot and cold water lines do not justify the cost. Acceptable performance is obtainable using a simple on-off control. Even in the case of audio equipment, where there is a payoff for building systems that are very close to linear, the benefits of linearity are confined to a finite range of amplitudes and a subset of frequencies. At the heart of the problem is the fact that standard optimal control theory provides no mechanism to incorporate implementation costs. In this talk we describe a formulation of control problems based on the Liouville equation that allows designers to balance implementation costs with the quality of the resulting trajectories.
     
Chernyak, Volodya  Wayne State University  Stochastic Control as a Non-Equilibrium Statistical Physics. In Stochastic Control (SC) one minimizes the average cost-to-go, consisting of the cost-of-control (amount of effort), the cost-of-space (where one wants the system to be) and the target cost (where one wants the system to finish), for a system obeying forced and controlled Langevin dynamics. We generalize the SC problem by adding to the cost-to-go a term accounting for the cost-of-dynamics, characterized by a vector potential. We provide a variational derivation of the generalized gauge-invariant Bellman-Hamilton-Jacobi equation for the optimal average cost-to-go, where the control is expressed in terms of current and density functionals, and discuss examples, e.g. ergodic control of a particle-on-a-circle, illustrating non-equilibrium space-time complexity over current/flux. The talk is based on joint work with M. Chertkov, J. Bierkens and H.J. Kappen.
     
Chertkov, Misha  Los Alamos National Laboratory  Fluid Mechanics as a Single-Particle Control. We show that different flows studied in fluid mechanics can be understood as following from generalized gauge-invariant stochastic control formulations of a single particle under different conditions. We describe in this language of control theory compressible and incompressible fluid mechanics, and how the governing equations for the control field become Navier-Stokes equations of different flavors. In the particular case of stochastic control without any strong (dynamically enforced) constraints, with costs of space and dynamics varying/fluctuating in 1+1 dimensional time-space, we arrive at Burgulence (Burgers turbulence), decaying or forced depending on details of the setting. Different stages of development and interaction of shocks are interpreted in terms of control. This is work in progress in collaboration with V. Chernyak, following from the related work "Stochastic Control as a Non-Equilibrium Statistical Physics" with V. Chernyak, J. Bierkens and H.J. Kappen.
     
Dall'Asta, Luca  Politecnico di Torino  Activation and control of extreme trajectories in network dynamics. Simple models of irreversible dynamics on networks have been successfully applied to describe cascade processes in many fields of research. However, the problem of optimizing the trajectory of the system, e.g. obtaining the largest propagation using the smallest set of seeds, is still considered practically intractable in large networks. I will present a new formulation of the problem that exploits the cavity method to develop an efficient message passing algorithm that solves several types of spread optimization processes on very large networks. In the controlled setup provided by random graphs, I will also discuss the mechanisms underlying the optimized dynamics, showing that the performance depends on the strength of the dynamical cooperative effects.
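To make the setting concrete, here is a minimal simulation of the kind of irreversible threshold cascade whose trajectories are optimized in the talk (an illustrative model, not the cavity/message-passing algorithm itself):

```python
import networkx as nx

# Irreversible threshold cascade: a node activates once at least `theta`
# of its neighbors are active, and never deactivates. The spread
# optimization problem asks, e.g., for the smallest seed set that
# activates the whole graph; this sketch only simulates the dynamics.
def cascade(G, seeds, theta=2):
    active = set(seeds)
    changed = True
    while changed:
        changed = False
        for v in G.nodes:
            if v not in active and sum(u in active for u in G[v]) >= theta:
                active.add(v)
                changed = True
    return active

G = nx.random_regular_graph(4, 100, seed=0)
print(len(cascade(G, seeds=range(10))))   # how far 10 seeds spread
```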
     
Delvenne, Jean-Charles  University of Louvain  Thermodynamics and linear control theory. We show how to connect the tools of linear control theory (dissipativity, Kalman filtering, time-scale separation, etc.) with the main results of thermodynamics (the First and Second Laws, Carnot's theorem, the Fourier-Cattaneo law, finite-time thermodynamics) in a consistent and rigorous way. Joint work with Henrik Sandberg and John C. Doyle.
     
Dvijotham, Krishnamurthy (DJ)  University of Washington  Generalizations of Linearly Solvable Optimal Control. We present various generalizations of the theory of linearly solvable optimal control. One generalization is the extension to the game-theoretic and risk-sensitive setting, which requires replacing KL costs with Renyi divergences. Another extension is the derivation of efficient policy gradient algorithms that only require sampling the state space (rather than the state-action space) for various problem formulations (finite horizon, infinite horizon, first-exit, discounted). For the finite horizon case, we show that the PI^2 algorithm (Policy Improvement with Path Integrals) can be seen as a special case of our policy gradient algorithms when applied to a risk-seeking objective. Finally, we present applications of these policy gradient algorithms to various problems in movement control and power systems.
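For readers unfamiliar with the divergence mentioned above, the Rényi divergence that replaces the KL cost in the risk-sensitive setting has the standard definition (the notation here is ours, not necessarily the talk's):

```latex
% Renyi divergence of order alpha; as alpha -> 1 it recovers the KL
% divergence used in the original linearly solvable formulation.
D_\alpha(P \,\|\, Q) = \frac{1}{\alpha - 1}
  \log \sum_x P(x)^{\alpha}\, Q(x)^{1-\alpha},
\qquad
\lim_{\alpha \to 1} D_\alpha(P \,\|\, Q) = \mathrm{KL}(P \,\|\, Q).
```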
     
Friston, Karl  University College London  Free energy and active inference. How much about our interactions with, and experience of, our world can be deduced from basic principles? This talk reviews recent attempts to understand the self-organised behaviour of embodied agents, like ourselves, as satisfying basic imperatives for sustained exchanges with the environment. In brief, one simple driving force appears to explain many aspects of action and perception. This driving force is the minimisation of surprise or prediction error that, in the context of perception, corresponds to Bayes-optimal predictive coding (which suppresses exteroceptive prediction errors) and, in the context of action, reduces to classical motor reflexes (which suppress proprioceptive prediction errors). We will look at some of the phenomena that emerge from this single principle, such as the perceptual encoding of sensory trajectories (bird song and action perception). These perceptual abilities rest upon prior beliefs about the world, but where do these beliefs come from? I will finish by discussing recent proposals about the nature of prior beliefs and how they underwrite the active sampling of the sensorium. Put simply, to minimise surprising states of the world, it is necessary to sample inputs that minimise uncertainty about the causes of sensory input. When this minimisation is implemented via prior beliefs about how we sample the world, the resulting behaviour is remarkably reminiscent of the searches seen in exploration and visual search.
     
Guerra, Francesco  University of Rome  Stochastic variational principles for dissipative and conservative systems: the problem of time reversal invariance. In the frame of the method of statistical ensembles, relaxation to thermodynamic equilibrium can be efficiently described by using stochastic differential equations. This scheme is intrinsically non-invariant with respect to time reversal. However, we can consider an additional dynamical variable, called the importance function, whose meaning and motivation arise from neutron diffusion theory. The resulting scheme is now time reversal invariant, with a complete symplectic structure. The equations for the density and the importance function are Hamilton equations for a properly chosen Hamiltonian, and obey a stochastic variational principle. On the other hand, we can consider the formulation of quantum mechanics according to the stochastic scheme devised by Edward Nelson. In this frame a stochastic variational principle can be easily introduced. The theory is intrinsically time reversal invariant. We still have a symplectic structure, and canonical Hamilton equations. Here the conjugate variables are the quantum mechanical probability density and the phase of the wave function. We give a synthetic description of the two schemes, pointing out their structural similarity and their deep physical difference.
     
Handel, Ramon van  Princeton University Filtering in High Dimension and Statistical Physics. The conditional ergodic theory of stochastic models is essential to understanding the behavior of nonlinear filtering algorithms, which are widely used in applications ranging from navigation and robotics to data assimilation.  Perhaps the major outstanding problem in this area, both mathematically and algorithmically, is that all known methods break down in systems that possess a high-dimensional state space.  In this talk, I will argue that such problems are closely connected with models from statistical physics, including random walks in random environments, Ising-type spin glasses, and interacting particle systems.  I will also explain how a better understanding of these connections might lead to new types of filtering algorithms that can avoid the curse of dimensionality.
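As a concrete instance of the algorithms whose high-dimensional breakdown the talk addresses, here is a minimal bootstrap particle filter. It is a hedged sketch on an illustrative linear-Gaussian model; the collapse of the effective sample size as the dimension d grows is exactly the curse of dimensionality at issue:

```python
import numpy as np

# Bootstrap particle filter for  x_t = x_{t-1} + process noise,
# y_t = x_t + observation noise, in dimension d. As d grows, the
# normalized weights concentrate on a single particle -- the
# high-dimensional breakdown discussed in the talk.
rng = np.random.default_rng(1)

def particle_filter(ys, d, N=1000):
    x = rng.normal(size=(N, d))
    for y in ys:
        x += rng.normal(scale=0.5, size=(N, d))           # propagate
        logw = -0.5 * np.sum((y - x) ** 2, axis=1)        # Gaussian likelihood
        w = np.exp(logw - logw.max()); w /= w.sum()       # normalize stably
        x = x[rng.choice(N, size=N, p=w)]                 # resample
        print(f"d={d}  ESS={1.0 / np.sum(w ** 2):8.1f}")  # effective sample size
    return x.mean(axis=0)

for d in (1, 10, 100):                                    # watch the ESS shrink
    truth = np.zeros(d)
    particle_filter([truth + rng.normal(size=d)], d)
```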
     
Hurtado, Pablo / Garrido, Pedro  Universidad de Granada  Spontaneous Symmetry Breaking at the Fluctuating Level. Phase transitions not allowed in equilibrium steady states may nevertheless happen at the fluctuating level. We observe this striking and general phenomenon for the first time by measuring current fluctuations in an isolated diffusive system. While small fluctuations result from the sum of weakly-correlated local events, for currents above a critical threshold the system self-organizes into a coherent traveling wave which facilitates the current deviation by gathering energy in a localized packet, thus breaking translation invariance. This results in Gaussian statistics for small fluctuations but non-Gaussian tails above the critical current. Our observations, which agree with predictions derived from hydrodynamic fluctuation theory, strongly suggest that rare events are generically associated with coherent, self-organized patterns which enhance their probability.
     
Kappen, Bert  SNN, Radboud University Nijmegen  The statistical physics of control and inference. In this introductory talk, I will present my personal perspective on how stochastic control theory is related to quantum mechanics, statistical inference, statistical physics and large deviation theory. It also provides my motivation for this conference. The talk consists of three parts. In the first part, I will review the ideas of Nelson and Guerra that relate the Schrödinger equation to a class of stochastic control problems. The second part revisits the path integral control theory that builds on the early work of Fleming and Mitter. These control problems are intimately related to statistical inference and statistical physics. The third part of the talk discusses an idea, originally formulated by Schrödinger, of how control of a density naturally leads to a large deviation principle. The rate function is identified as the Kullback-Leibler divergence, which also plays a central role in the path integral control theory.
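The linearization at the heart of the path integral theory can be summarized as follows (standard form of the result; a sketch under the usual assumption that the control cost weight R and noise covariance ν are tied by λ R^{-1} = ν):

```latex
% Path integral control in one line: for dynamics dx = f dt + g(u dt + dW),
% state cost V, control cost (1/2) u^T R u, and lambda R^{-1} = nu, the
% log transform  psi = exp(-J/lambda)  turns the nonlinear HJB equation for
% the optimal cost-to-go J into a *linear* PDE,
-\partial_t \psi = -\frac{V}{\lambda}\,\psi + f^{\!\top}\nabla\psi
  + \tfrac{1}{2}\,\mathrm{Tr}\!\left(g\,\nu\,g^{\!\top}\nabla^{2}\psi\right),
% whose Feynman-Kac solution is an expectation over *uncontrolled* paths:
\qquad
\psi(x,t) = \mathbb{E}\!\left[\exp\!\Big(-\tfrac{1}{\lambda}\Big(\phi(x_T)
  + \int_t^T V(x_s)\,ds\Big)\Big)\right].
```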
     
Landim, Claudio  University of Rouen  A thermodynamical theory for non-equilibrium systems. We present a "physical theory" for a certain class of thermodynamic systems out of equilibrium, which is founded on and supported by the analysis of a large family of stochastic microscopic models. We describe the non-equilibrium stationary states and the non-equilibrium free energy of the system, and we examine the situation in which a stationary state is driven to another stationary state by varying the external parameters on the macroscopic time scale.
     
Lloyd, Seth  MIT  Quantum limits to measuring space and time. This talk derives fundamental limits to the accuracy with which clocks and signals (e.g. the global positioning system, GPS) can measure space and time. By combining quantum limits to the accuracy of measurement with the requirement that the local energy density of clocks and signals be less than the black hole energy density, I derive the quantum geometric limit: the total number of ticks of clocks and clicks of detectors that can take place in a volume of spacetime of radius R over time T is no greater than R times T divided by the Planck length times the Planck time.
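Stated as a formula (a direct transcription of the bound quoted above, with the standard expressions for the Planck length and time):

```latex
% Quantum geometric limit: total number of ticks of clocks and clicks of
% detectors in a spacetime region of radius R and duration T.
N \;\le\; \frac{R\,T}{\ell_P\, t_P},
\qquad
\ell_P = \sqrt{\frac{\hbar G}{c^{3}}}, \quad t_P = \sqrt{\frac{\hbar G}{c^{5}}}.
```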
     
Maassen, Hans  Radboud University Nijmegen  Optimal quantum feedback control. We give a basic account of quantum filtering and control from the point of view of quantum probability and information theory. We describe and prove the "no-cloning" principle, and Heisenberg's related principle which says that the leakage of information to the environment necessarily implies a deterioration of the quantum state. A typical challenge to quantum technology issuing from this principle is to recover the leaked information and to feed it back into the system under control in order to preserve its state. We show that in the situation of discrete time and complete recovery such feedback control indeed enables us to stabilize any particular quantum state, pure or mixed. We illustrate this fact by an example where it is actually well-known: the fluorescent two-level atom (in a discrete time setting).
     
Mézard, Marc  Université de Paris Sud, Orsay  Occam's razor in massive data acquisition: a statistical physics approach. Acquiring a large amount of information in a short time is crucial for many tasks in control. Compressed sensing is triggering a major evolution in signal acquisition. It consists in sampling a sparse signal at low rate and later using computational power for its exact reconstruction, so that only the necessary information is measured. Currently used reconstruction techniques are, however, limited to acquisition rates larger than the true density of the signal. We shall describe a new procedure which is able to reconstruct the signal exactly with a number of measurements that approaches the theoretical limit in the limit of large systems. It is based on the joint use of three essential ingredients: a probabilistic approach to signal reconstruction, a message-passing algorithm adapted from belief propagation, and a careful design of the measurement matrix inspired by the theory of crystal nucleation. F. Krzakala, M. Mezard, F. Sausset, Y. Sun and L. Zdeborova, Phys. Rev. X 2 (2012) 021005.
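For context, here is the standard convex baseline that the abstract's procedure improves upon: L1 minimization (basis pursuit), which recovers sparse signals only at acquisition rates above the density of the signal. This is a hedged sketch of the baseline, not the message-passing plus nucleation-inspired scheme of Krzakala et al.:

```python
import numpy as np
from scipy.optimize import linprog

# Recover a k-sparse signal s in R^n from m < n random linear measurements
# y = A s by solving  min ||s||_1  s.t.  A s = y  (basis pursuit), written
# as a linear program over s = s_pos - s_neg with s_pos, s_neg >= 0.
rng = np.random.default_rng(0)
n, m, k = 100, 40, 5                       # signal size, measurements, sparsity
s = np.zeros(n); s[rng.choice(n, k, replace=False)] = rng.normal(size=k)
A = rng.normal(size=(m, n)) / np.sqrt(m)   # random measurement matrix
y = A @ s

res = linprog(c=np.ones(2 * n),
              A_eq=np.hstack([A, -A]), b_eq=y,
              bounds=[(0, None)] * (2 * n), method="highs")
s_hat = res.x[:n] - res.x[n:]
print("recovery error:", np.linalg.norm(s_hat - s))
```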
     
Mitter, Sanjoy  MIT  The Duality between Estimation and Control. In this talk I discuss Bayesian inference as a problem in free energy minimization and the Fenchel-Legendre duality related to this viewpoint. I specialize this to the estimation problem for nonlinear diffusions with nonlinear observations and show how this gives a stochastic control interpretation of the nonlinear estimation problem. Finally, I discuss how Shannon's noisy channel coding theorem could be viewed as a Bayesian inference problem and demonstrate its similarities to the Gibbs variational principle.
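The duality underlying this viewpoint is the Gibbs variational principle (standard statement; how the talk specializes it to diffusions is of course its own content):

```latex
% Gibbs variational principle / Donsker-Varadhan duality: the log-partition
% function and the relative entropy are Fenchel-Legendre conjugates.
\log \mathbb{E}_P\!\left[e^{f}\right]
  = \sup_{Q \ll P}\Big\{ \mathbb{E}_Q[f] - \mathrm{KL}(Q\,\|\,P) \Big\},
\qquad
\mathrm{KL}(Q\,\|\,P)
  = \sup_{f}\Big\{ \mathbb{E}_Q[f] - \log \mathbb{E}_P\!\left[e^{f}\right] \Big\}.
```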
     
Morimoto, Jun  ATR Kyoto  Applications of stochastic optimal control methods to humanoid robots. We introduce our attempts to use stochastic optimal control methods for humanoid robot control. In particular, we propose a phase-dependent policy optimization method for Central Pattern Generator (CPG)-based periodic movement controllers. We use the synchronization property of the CPG to modulate the periodic patterns and explicitly take the phase information of the CPG into account for policy optimization. As a concrete example, we consider biped walking problems. To optimize the walking policy, a model-free optimal control method is preferable because precise modeling of the ground contact is difficult. On the other hand, model-free trajectory optimization methods have been considered a computationally demanding approach. However, because of recent advances in nonlinear trajectory optimization methods, model-free optimization is now a realistic approach for biped trajectory optimization. The main outcomes of this study are twofold: 1) we empirically show that a path integral reinforcement learning method can be used to improve the biped walking trajectory and to design the local feedback controller around the trajectory for the CPG-based controller in the high-dimensional state space, and 2) we also show that phase-dependent trajectory optimization can improve the walking policies faster than traditional time-dependent optimization.
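As background for readers unfamiliar with CPG-based control, here is a minimal phase-oscillator sketch of the synchronization property being exploited. The dynamics, gains, and feedback signal are illustrative choices, not ATR's controller:

```python
import numpy as np

# One phase oscillator entrained by a sensed rhythm: the intrinsic frequency
# is 1 Hz, and the feedback term pulls the phase toward a 1.2 Hz gait signal.
# A periodic policy u(phi) is then indexed by this phase, and phase-dependent
# policy optimization tunes u separately at each phase.
dt, omega, K = 0.001, 2 * np.pi * 1.0, 8.0
phi = 0.0
for step in range(5000):
    t = step * dt
    sensed = np.sin(2 * np.pi * 1.2 * t)             # e.g. ground-contact rhythm
    phi += dt * (omega - K * sensed * np.sin(phi))   # entrainment dynamics
print(phi % (2 * np.pi))                             # phase used to index u(phi)
```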
     
Nakano, Yumiharu  Tokyo Institute of Technology  An approximation scheme for optimal stochastic control problems. We propose a simple time-discretization scheme for multi-dimensional optimal stochastic control problems. It is based on a probabilistic representation for the convolution of the value function with a probability density function. The resulting numerical method allows us to use an uncontrolled Markov process to estimate the conditional expectations in the dynamic programming procedure. Moreover, it can be implemented without interpolation of the value function or modification of the diffusion matrix. We show convergence results under quite mild conditions on the coefficients of the problems by the Barles-Souganidis viscosity solution method. [1] F. Camilli and M. Falcone, An approximation scheme for the optimal control of diffusion processes, Math. Model. Numer. Anal., 29 (1995), 97-122. [2] A. Fahim, N. Touzi, and X. Warin, A probabilistic numerical method for fully nonlinear parabolic PDEs, Ann. Appl. Probab., 21 (2011), 1322-1364. [3] H.J. Kushner and P. Dupuis, Numerical methods for stochastic control problems in continuous time, Springer-Verlag, New York, 2001.
     
Parrilo, Pablo A.  MIT An Optimal Architecture for Decentralized Control Over Posets. Partially ordered sets (posets) provide a natural way of modeling problems where communication constraints between subsystems have a hierarchical or causal structure. In this talk, we consider general poset-causal decentralized decision-making problems, and describe a simple and natural controller architecture based on the Moebius transform of the poset. This architecture possesses simple and appealing separation properties. In particular, we show how our earlier results on H2-optimal decentralized control for arbitrary posets can be cast exactly into this framework, by exploiting a key separation property of the H2 norm, and the incidence algebra of the poset.  Joint work with Parikshit Shah (U. Wisconsin).
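For concreteness, the Moebius function of a poset, the kernel of the transform on which the proposed architecture is based, can be computed by a short recursion. The poset below is an illustrative example, not one from the talk:

```python
from functools import lru_cache

# Moebius function of a finite poset, defined recursively by
#   mu(x, x) = 1,   mu(x, y) = - sum_{x <= z < y} mu(x, z)   for x < y.
# Example poset: divisibility on {1, 2, 3, 6}.
elements = [1, 2, 3, 6]
leq = lambda a, b: b % a == 0          # partial order: a divides b

@lru_cache(maxsize=None)
def mu(x, y):
    if x == y:
        return 1
    if not leq(x, y):
        return 0
    return -sum(mu(x, z) for z in elements if leq(x, z) and leq(z, y) and z != y)

print([[mu(x, y) for y in elements] for x in elements])
# The Moebius transform inverts the zeta transform (cumulative sums along the
# order) -- the incidence-algebra structure the controller architecture exploits.
```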
     
Parrondo, Juan  Universidad Complutense Madrid Maxwell demons, feedback control, and fluctuation theorems. As illustrated by the Maxwell demon and its sequels, feedback can be utilized to convert information into useful work. The recently developed fluctuation theorems turn out to be a powerful tool to analyze the energetics of feedback controlled systems. Using these theorems, we devise a method for designing optimal feedback protocols for  thermodynamic engines that extract all the information gained during feedback as work. Our method is based on the observation that in a feedback-reversible process the measurement and the time-reversal of the ensuing protocol both prepare the system in the same probabilistic state. We illustrate the utility of our method with two examples of  the multi-particle Szilard engine.
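The quantitative backdrop for "extracting all the information gained as work" is the feedback-generalized second law (a standard result in this literature, e.g. due to Sagawa and Ueda; the feedback-reversible protocols of the talk are the ones that saturate it):

```latex
% Second law with feedback: the average work extracted in an isothermal
% cycle is bounded by the mutual information I acquired by the measurement;
% feedback-reversible protocols achieve equality, converting *all* the
% information gained into work.
\langle W_{\mathrm{ext}} \rangle \;\le\; k_B T \, I ,
\qquad
I = \sum_{m} \int p(x, m)\, \ln \frac{p(x, m)}{p(x)\,p(m)} \, dx .
```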
     
Sandberg, Henrik  KTH Royal Institute of Technology Physical implementations and limitations in control theory. We discuss how and when one can implement and approximate given active and passive systems using lossless/Hamiltonian systems. In particular, we show that lossless systems are dense in the passive systems in a certain sense. We furthermore discuss how laws from statistical physics, such as the fluctuation-dissipation theorem, give rise to meaningful limitations in control problems. Joint work with Jean-Charles Delvenne and John C. Doyle.
     
Satoh, Satoshi  Hiroshima University  Iterative Path Integral Method for Nonlinear Stochastic Optimal Control. So far, we have been studying nonlinear stochastic control. For example, in [1, 2, 3], we have proposed an asymptotic stabilization method based on properties of physical systems, such as passivity and invariance, for a class of nonlinear stochastic systems. Besides, in [4, 5], we have proposed a stochastic bounded stabilization controller, which renders the state of the plant system bounded in probability for a given probability and given bounds on the state. The main subject of this talk is nonlinear optimal control, and I would like to introduce our recent research with Prof. Bert Kappen on an extension of the path integral stochastic optimal control method. The nonlinear stochastic optimal control problem reduces to solving the stochastic Hamilton-Jacobi-Bellman (SHJB) equation. However, it is generally quite difficult to solve the SHJB equation, because it is a second-order nonlinear PDE. The path integral method proposed by Kappen [6] provides an efficient solution for an SHJB equation corresponding to a class of nonlinear stochastic optimal control problems, based on a statistical physics approach. Although this method is very useful, some assumptions required by this method restrict its application. To solve this problem, we have proposed an iterative solution for the path integral method in our report [7]. The proposed method solves the SHJB equation iteratively without imposing the assumptions that are necessary in the conventional method. Consequently, it enables us to solve a wider class of stochastic optimal control problems based on the path integral approach. Since the proposed method reduces to the conventional method when the assumptions hold, it is considered to be a natural extension of the conventional result. Furthermore, we investigate a convergence property of the algorithm. [1] S. Satoh and K. Fujimoto, "On passivity based control of stochastic port-Hamiltonian systems," in Proc. 47th IEEE Conf. on Decision and Control, 2008, pp. 4951-4956. [2] --, "Passivity based control of stochastic port-Hamiltonian systems," Trans. the Society of Instrument and Control Engineers, vol. 44, no. 8, pp. 670-677, 2008, (in Japanese). [3] --, "Stabilization of time-varying stochastic port-Hamiltonian systems based on stochastic passivity," in Proc. IFAC Symp. Nonlinear Control Systems, 2010, pp. 611-616. [4] --, "Observer based stochastic trajectory tracking control of mechanical systems," in Proc. ICROS-SICE Int. Joint Conf. 2009, 2009, pp. 1244-1248. [5] S. Satoh and M. Saeki, "Bounded stabilization of a class of stochastic port-Hamiltonian systems," in Proc. 20th Symp. Mathematical Theory of Networks and Systems, 2012, pp. (CD-ROM) 0150. [6] H. J. Kappen, "Path integrals and symmetry breaking for optimal control theory," J. Statistical Mechanics: Theory and Experiment, p. P11011, 2005. [7] S. Satoh, H. J. Kappen, and M. Saeki, "A solution method for nonlinear stochastic optimal control based on path integrals," in Proc. 12th SICE System Integration Division Annual Conf., 2012, p. P0194, (in Japanese).
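As a point of reference for the iterative extension, here is a Monte Carlo sketch of the conventional path integral control law of [6] on a toy one-dimensional problem (the model, costs and parameters are illustrative; this is not the iterative method of [7]):

```python
import numpy as np

# Conventional path integral control: dx = u dt + sigma dW with cost
# phi(x_T) + (1/2) int u^2 dt, under the required relation lambda = sigma^2.
# The optimal control is a weighted average of the first noise increment
# over *uncontrolled* sample paths, with weights exp(-S/lambda).
rng = np.random.default_rng(0)
sigma, T, dt, N = 1.0, 1.0, 0.01, 20000
lam = sigma ** 2
phi = lambda x: 5.0 * (x - 1.0) ** 2       # terminal cost: steer x toward 1

def u_star(x0):
    steps = int(T / dt)
    x = np.full(N, x0)
    dW0 = rng.normal(0.0, np.sqrt(dt), N)  # first noise increment (kept)
    x += sigma * dW0
    for _ in range(steps - 1):             # roll out uncontrolled dynamics
        x += sigma * rng.normal(0.0, np.sqrt(dt), N)
    w = np.exp(-phi(x) / lam)              # path weights exp(-S/lambda)
    w /= w.sum()
    return sigma * (w @ dW0) / dt          # u* dt = weighted mean of sigma dW0

print(u_star(0.0))   # approx. the optimal initial push toward the target
```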
     
Seldin, Yevgeny  Max Planck Institut, Tuebingen  PAC-Bayesian Analysis: A link between inference and statistical physics. PAC-Bayesian analysis is a general tool for deriving generalization bounds for a wide class of inference rules. Interestingly, PAC-Bayesian generalization bounds take the form of a trade-off between the empirical performance of the inference rule and the KL-divergence between the posterior distribution over the hypothesis space applied by the inference rule and a prior distribution over the hypothesis space. This form of trade-off is closely related to the free energy in statistical physics. Moreover, PAC-Bayesian bounds can be used to determine the right "temperature" at which the system should be analyzed given a finite sample. In other words, PAC-Bayesian analysis introduces a principled way of treating finite samples in the application of methods from statistical physics to inference. We present a generalization of PAC-Bayesian analysis to martingales. This generalization makes it possible to apply PAC-Bayesian analysis to time-evolving processes, including importance-weighted sampling, reinforcement learning, and many other domains. References: [1] Yevgeny Seldin, François Laviolette, Nicolò Cesa-Bianchi, John Shawe-Taylor, and Peter Auer. PAC-Bayesian inequalities for martingales. IEEE Transactions on Information Theory, 2012. Accepted. [2] Yevgeny Seldin, Peter Auer, François Laviolette, John Shawe-Taylor, and Ronald Ortner. PAC-Bayesian analysis of contextual bandits. In Advances in Neural Information Processing Systems (NIPS) 25, 2011.
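One standard form of the trade-off described above is the McAllester-style PAC-Bayesian bound (the martingale generalization of [1] extends this beyond i.i.d. data):

```latex
% With probability at least 1 - delta over an i.i.d. sample of size n,
% simultaneously for all posteriors rho over the hypothesis space, given
% a fixed prior pi: expected loss <= empirical loss + complexity term.
\mathbb{E}_{\rho}\!\left[L(h)\right] \;\le\;
\mathbb{E}_{\rho}\!\left[\hat{L}_n(h)\right]
  + \sqrt{\frac{\mathrm{KL}(\rho \,\|\, \pi) + \ln\frac{2\sqrt{n}}{\delta}}{2n}} .
% The KL term is the free-energy-like penalty linking the bound to
% statistical physics; the sample size n sets the effective "temperature".
```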
     
Shah, Devavrat  MIT  Making centralized computation 'faster, distributed' and sometimes 'better'. I will introduce a generic method for approximate inference in graphical models using graph partitioning. The resulting algorithm is linear time and provides an excellent approximation to the maximum a posteriori (MAP) assignment for a large class of graphical models, including any graph with 'polynomial growth' and graphs that exclude fixed minors (e.g. planar graphs). In general, the algorithm can be thought of as a "meta" algorithm that can be used to speed up any existing inference algorithm without losing performance. The goal of the talk is primarily to introduce the algorithm and provide insights into why such a simplistic algorithm works. Time permitting, I will also discuss its implications for 'modularity clustering', which has been popularly utilized in processing networked data.
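A hedged sketch of the partition-based "meta" idea on a small pairwise binary model: split the graph into parts, solve MAP exactly inside each part by enumeration, and ignore the cut edges. The actual algorithm, its choice of partition, and its guarantees are the subject of the talk; the model below is illustrative:

```python
import itertools
import networkx as nx

# MAP by partitioning: exact enumeration within each block of nodes,
# cut edges between blocks are simply dropped.
def map_by_parts(G, node_pot, edge_pot, parts):
    x = {}
    for part in parts:
        H = G.subgraph(part)
        best, best_score = None, float("-inf")
        for conf in itertools.product((0, 1), repeat=len(part)):
            a = dict(zip(part, conf))
            score = sum(node_pot[(v, a[v])] for v in part)
            score += sum(edge_pot[(a[u], a[v])] for u, v in H.edges)
            if score > best_score:
                best, best_score = a, score
        x.update(best)
    return x

G = nx.grid_2d_graph(4, 4)                       # 4x4 lattice MRF
node_pot = {(v, s): (0.3 if s else -0.3) for v in G for s in (0, 1)}
edge_pot = {(a, b): (0.5 if a == b else -0.5) for a in (0, 1) for b in (0, 1)}
parts = [[v for v in G if v[0] < 2], [v for v in G if v[0] >= 2]]  # two blocks
print(map_by_parts(G, node_pot, edge_pot, parts))
```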
     
Theodorou, Evangelos  University of Washington  Free Energy and Relative Entropy Dualities: Connections to path integral control and applications to robotics. While optimal control and reinforcement learning are fundamental frameworks for learning and control applications, their application to high dimensional control systems of the complexity of humanoid and biomimetic robots has largely been impossible so far. Among the key problems is that classical value function-based approaches run into severe limitations in continuous state-action spaces due to issues of value function approximation. Additionally, the computational complexity and time of exploring high dimensional state-action spaces quickly exceed practical feasibility. As an alternative, researchers have turned to trajectory-based reinforcement learning, which sacrifices global optimality in favor of being applicable to high-dimensional state-action spaces. Model-based algorithms, inspired by ideas of differential dynamic programming, have demonstrated some success if models are accurate. Model-free trajectory-based reinforcement learning has been limited by problems of slow learning and the need to tune many open parameters. Recently, reinforcement learning has moved towards combining classical techniques from stochastic optimal control and dynamic programming with learning techniques from statistical estimation theory and the connection between SDEs and PDEs via the Feynman-Kac lemma. In this talk, I will discuss theoretical developments and extensions of path integral control to iterative cases and present algorithms for policy improvement in continuous state-action spaces. I will provide information-theoretic interpretations and extensions based on the fundamental relationship between free energy and relative entropy. The aforementioned relationship provides an alternative view of stochastic optimal control theory that does not rely on the Bellman principle. I will demonstrate the applicability of the proposed algorithms to control and learning of humanoid, manipulator and tendon-driven robots, and propose future directions in terms of theory and applications.
     
Tishby, Naftali  The Hebrew University  On the discounting of information and the renormalization group of the Bellman equation. We argue that a consistent formulation of optimal sensing and control must include information terms, yielding an extension of the standard POMDP setting. To make the standard reward/cost terms consistent with the information terms, while still allowing tractable computation, the standard uniformity of time must be altered. We argue that this can be done by successive refinement of the information-value tradeoff, which also leads to the emergence of hierarchies and reverse-hierarchies for both perception and planning.
     
Todorov, Emanuel  University of Washington  From linearly-solvable optimal control to trajectory optimization, and (hopefully) back. We have identified a general class of stochastic optimal control problems which are inherently linear, in the sense that the exponentiated optimal value function satisfies a linear equation. These problems have a number of unique properties which enable more efficient numerical methods than generic formulations. However, after several attempts to go beyond the simple numerical examples characteristic of this literature and scale to real-world problems (particularly in robotics), we realized that the curse of dimensionality is still a curse. We then took a detour, and developed trajectory optimization methods that can synthesize remarkably complex behaviors fully automatically. Thanks to the parallel processing capabilities of modern computers, some of these methods work in real time in model-predictive-control (MPC) mode, giving rise to implicitly defined feedback control laws. But not all problems can be solved in this way, and furthermore it would be nice to somehow re-use the local solutions that MPC generates. The next step is to combine the strengths of these two approaches: using trajectory optimization to identify the regions of state space where the optimally-controlled stochastic system is likely to spend its time, and then applying linearly-solvable optimal control restricted to these regions.
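The "inherently linear" property can be made concrete in the discrete-state case. Below is a hedged sketch of the resulting fixed-point ("z") iteration for a first-exit problem on a toy chain; the model is illustrative, and the notation follows the common formulation in which the exponentiated value function z = exp(-v) satisfies a linear equation:

```python
import numpy as np

# Linearly solvable MDP: with passive dynamics P, state cost q, and
# desirability z = exp(-v), the first-exit Bellman equation is linear:
#   z = exp(-q) * (P z),   with z fixed at exp(-final cost) on the boundary.
n = 5
P = np.zeros((n, n))
for i in range(1, n - 1):
    P[i, i - 1] = P[i, i + 1] = 0.5        # passive random walk on a chain
P[0, 1] = 1.0                              # reflecting left end
q = np.array([1.0, 1.0, 1.0, 1.0, 0.0])    # state costs; goal state 4 is free

z = np.ones(n)
for _ in range(1000):                      # fixed-point (power) iteration
    z_new = np.exp(-q) * (P @ z)
    z_new[n - 1] = 1.0                     # boundary condition at the goal
    z = z_new
v = -np.log(z)                             # optimal cost-to-go
u = P * z[None, :]                         # optimal controlled transitions:
u /= u.sum(axis=1, keepdims=True) + 1e-12  #   p*(j|i) proportional to p(j|i) z(j)
print(np.round(v, 3))
```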
     
Wiegerinck, W. / Lessmann, P.  SNN, Radboud University Nijmegen  Large deviation theory for inference and control. Recently, it has been demonstrated that an interesting class of stochastic control problems can be mapped onto a Bayesian inference problem. For these problems, approximate inference methods can be applied to efficiently compute approximate optimal controls. In this talk we consider an approximate inference approach based on the large deviation principle. The posterior distribution of the quantity of interest is approximated by an exponentiated rate function. This rate function is obtained by solving a deterministic inference/control problem, which is assumed to be easier to compute, e.g. by dynamic programming. Some control examples are discussed.
     
Zecchina, Riccardo  University of Torino  From stochastic optimization techniques to control algorithms for dynamical cascade processes. We will discuss how statistical physics techniques developed for the study of stochastic optimization problems can be used to design efficient algorithms for analyzing, controlling and activating extreme trajectories in a variety of cascade processes over graphs and lattices, in which nodes "activate" depending on the state of their neighbors. The problem is in general intractable, with the exception of models that satisfy a sort of diminishing returns property called submodularity (submodular models can be approximately solved by means of greedy strategies, but by definition they lack the cooperative characteristics which are fundamental in many real systems). We show that for a wide class of irreversible dynamics, even in the absence of submodularity, the problem of activating rare trajectories in the dynamics can be solved efficiently on large networks. Examples of preliminary applications range from the maximization of the spread of influence in Threshold Linear Models (bootstrap percolation) to the minimization of infection processes in SIR models.