Keywords: Bayesian inference, learning and reasoning, stochastic control theory, neural networks, statistical physics
One of the marked differences between computers and animals is the ability of the latter to learn and flexibly adapt to
changing situations. Whereas computers need to be programmed with provisions for all possible future circumstances, the
brain adapts its 'program' when needed, striking a remarkable balance between flexibility to adapt on the one hand and
persistence by re-using pre-learned facts and skills on the other. Examples of such intelligent behavior
are pattern recognition, learning and memory, reasoning, planning and motor control.
Due to the essential roles that noise and uncertainty play in perception and learning, a useful way to model intelligence
is to use probability models. In the mid-1990s, analog and digital computing, until then separate approaches to modelling
intelligence, began to merge around the idea of Bayesian inference: one can generalize the logic of digital computation
to a probabilistic calculus, embodied in a so-called graphical model. Similarly, one can generalize dynamical systems to
stochastic dynamical systems that allow for a probabilistic description in terms of a Markov process. The Bayesian paradigm
has greatly helped to integrate different schools of thought, in particular in artificial intelligence and
machine learning, and also provides a computational paradigm for neuroscience.
My research is dedicated to the design of efficient computational methods for Bayesian inference and stochastic
control theory using ideas and methods from statistical physics and quantum physics.
The aim of this research is to advance
artificial intelligence research and computational models of brain function.
Bayesian Inference
Bayesian models are probability models and the typical computation, whether in the context of a complex data analysis
problem or in a stochastic neural network, is to compute an expectation value, which is referred to as Bayesian inference.
In general, exact Bayesian inference is intractable, which means that computation time and memory use scale exponentially with the problem
size. However, many methods exist to compute these quantities approximately. Most of these methods originate from statistical
physics, such as the mean field method, belief propagation or Monte Carlo sampling. Application of these methods to machine
learning problems is challenging and an active field of research to which I have made several contributions.
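As an illustration of the trade-off described above, here is a minimal sketch that compares exact inference with the mean field method on a small Boltzmann-type distribution over binary variables. The model p(s) ∝ exp(½ sᵀws + θᵀs), the function names, the damping scheme and the random parameters are all assumptions made for this example; they are not taken from a specific project on this page.

```python
import itertools
import numpy as np

def exact_means(w, theta):
    """Exact expectation values <s_i>, summing over all 2^n states.

    The cost grows exponentially with n, which is why this is only
    feasible for very small models.
    """
    n = len(theta)
    states = np.array(list(itertools.product([-1, 1], repeat=n)))
    # Unnormalized log-probability: 0.5 * s^T w s + theta^T s
    log_p = 0.5 * np.einsum("ki,ij,kj->k", states, w, states) + states @ theta
    p = np.exp(log_p - log_p.max())
    p /= p.sum()
    return p @ states

def mean_field_means(w, theta, n_iter=200, damping=0.5):
    """Naive mean field: iterate m_i = tanh(theta_i + sum_j w_ij m_j).

    The cost per iteration is polynomial in n. Damping is an
    illustrative choice to help the fixed-point iteration converge.
    """
    m = np.zeros(len(theta))
    for _ in range(n_iter):
        m = damping * m + (1 - damping) * np.tanh(theta + w @ m)
    return m

# Hypothetical small model with weak symmetric couplings.
rng = np.random.default_rng(0)
n = 8
w = rng.normal(scale=0.1, size=(n, n))
w = (w + w.T) / 2
np.fill_diagonal(w, 0.0)
theta = rng.normal(scale=0.5, size=n)

m_exact = exact_means(w, theta)
m_mf = mean_field_means(w, theta)
max_abs_error = np.max(np.abs(m_exact - m_mf))
print("largest mean field error:", max_abs_error)
```

For weak couplings the mean field estimate is close to the exact result at a fraction of the cost. For strong couplings the approximation can degrade, which is one reason why more refined methods are of interest.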
Current projects focus on the application of these ideas in concrete problems:
The efficient approximate inference methods allow the design of large artificial reasoning systems. Currently, we are
designing a diagnostic decision support system for internal medicine consisting of thousands of diagnoses, that should
help the doctor during the diagnostic process (in collaboration with Radboud academic hospital).
Design of high-dimensional Bayesian data analysis methods. The motivation is that Bayesian integration of the posterior
distribution improves the statistical power of these methods compared to maximum likelihood approaches. Approximate
inference is used to efficiently compute statistics in the posterior distribution. One example is the use of the mean
field approximation for sparse L0 regression. Another example is Gaussian Process regression with Monte Carlo
sampling. In this case we have shown for yeast data that this method significantly outperforms all other methods and is
able to identify novel genetic causes. In addition, these methods are applied to analyse neuroscience data (EEG, fMRI, MEG),
for instance to find connectivity between brain regions.
Control theory
Control theory is a theory from engineering that gives a formal description of how a system, such as a robot or animal, can
move from a current state to a future state at minimal cost, where cost can mean time spent, or energy spent or any other
quantity. Control theory is used traditionally to control industrial plants, airplanes or missiles, but is also the natural
framework to model intelligent behavior in animals or robots. The mathematical formulation of deterministic control theory
is very similar to classical mechanics. In fact, classical mechanics can be viewed as a special case of control theory.
Stochastic control theory uses the language of stochastic differential equations. For a certain class of stochastic
control problems, the solution is described by a linear partial differential equation that can be solved formally as a path
integral. This so-called path integral control method provides a deep link between control, inference and statistical
physics. This statistical physics view of control theory shows that qualitatively different control solutions exist for
different noise levels separated by phase transitions. The control solutions can be computed using efficient approximate
inference methods such as Monte Carlo sampling or deterministic approximation methods. The path integral control theory is
successfully being used by leading research groups in robotics worldwide. For more information see the
path integral control theory page.
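The Monte Carlo computation of a path integral control can be sketched as follows. This is a minimal example under stated assumptions, not the implementation used in the work above: the dynamics are one-dimensional, dx = u dt + dξ with noise variance ν dt, the cost is an end cost φ(x_T) plus a quadratic control cost (R/2)u², and the temperature is λ = νR. The function name, parameter names and numerical values are hypothetical. The control is estimated by sampling uncontrolled trajectories, weighting each with exp(−φ(x_T)/λ), and averaging the first noise increment.

```python
import numpy as np

def path_integral_control(x0, t0, T, nu, R, end_cost,
                          n_samples=200_000, dt=0.05, seed=0):
    """Monte Carlo estimate of the optimal control at state x0 and time t0.

    Assumes dynamics dx = u dt + dxi with <dxi^2> = nu dt, an end cost
    only (no state cost along the path), and lambda = nu * R.
    """
    rng = np.random.default_rng(seed)
    n_steps = int(round((T - t0) / dt))
    lam = nu * R
    # Uncontrolled trajectories are sums of independent noise increments.
    noise = rng.normal(scale=np.sqrt(nu * dt), size=(n_samples, n_steps))
    x_T = x0 + noise.sum(axis=1)
    # Path weights exp(-S / lambda); subtracting the minimum avoids underflow.
    S = end_cost(x_T)
    weights = np.exp(-(S - S.min()) / lam)
    weights /= weights.sum()
    # Optimal control = weighted average of the first noise increment, per unit time.
    return np.sum(weights * noise[:, 0]) / dt

# Hypothetical test problem: steer towards a target with a quadratic end cost.
target, alpha = 1.0, 10.0
u_mc = path_integral_control(
    x0=0.0, t0=0.0, T=1.0, nu=1.0, R=1.0,
    end_cost=lambda x: 0.5 * alpha * (x - target) ** 2,
)
# For this linear-quadratic case the control is known in closed form:
# u = (target - x) / (T - t + R / alpha).
u_exact = (target - 0.0) / (1.0 + 1.0 / alpha)
print("Monte Carlo estimate:", u_mc, "closed form:", u_exact)
```

The closed-form comparison is only available because this toy problem is linear-quadratic. The appeal of the sampling approach is that the same estimator applies when the end cost is not quadratic, although the number of samples needed can then grow quickly.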
Computational neuroscience
A line of research that started in 2000 and continues today
concerns the effect that short-term synaptic plasticity has on memory
storage. The common understanding of long-term memory is that it is stored
in the synaptic connections between neurons in such a way that memory
retrieval occurs as the relaxation of the neural activity to a constant
spiking pattern, that represents the memory. This idea was put forward
by Hopfield (1982) and others as the attractor neural network. Synaptic
dynamics challenges this mechanism, since persistent pre-synaptic
activity typically weakens the synaptic strength. The inclusion
of short-term synaptic plasticity in an attractor neural network
makes memories metastable states, with the network rapidly switching from one state
to the next, depending on the sensory context. This work provides
some insight into the puzzle of how the brain, viewed as a dynamical
system, is able to build stable representations of the world and at
the same time is capable of effortlessly switching between them (with Joaquin Torres, University of Granada).
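The attractor mechanism referred to above can be illustrated with a minimal sketch of a standard Hopfield network with Hebbian weights and static synapses. Short-term synaptic plasticity, the subject of this research line, is deliberately not modelled here. The network size, the number of patterns and the function names are assumptions chosen for the example.

```python
import numpy as np

def store(patterns):
    """Hebbian weight matrix for +/-1 patterns (one pattern per row)."""
    n = patterns.shape[1]
    w = patterns.T @ patterns / n
    np.fill_diagonal(w, 0.0)  # no self-connections
    return w

def retrieve(w, state, n_sweeps=10, seed=0):
    """Asynchronous updates: the state relaxes towards a stored pattern."""
    rng = np.random.default_rng(seed)
    s = state.copy()
    for _ in range(n_sweeps):
        for i in rng.permutation(len(s)):
            s[i] = 1 if w[i] @ s >= 0 else -1
    return s

# Store 3 random patterns in a network of 100 neurons (illustrative sizes).
rng = np.random.default_rng(1)
patterns = rng.choice([-1, 1], size=(3, 100))
w = store(patterns)

# Corrupt 10 neurons of the first pattern and let the network relax.
cue = patterns[0].copy()
flipped = rng.choice(100, size=10, replace=False)
cue[flipped] *= -1
recalled = retrieve(w, cue)
overlap = recalled @ patterns[0] / 100
print("overlap with the stored pattern:", overlap)
```

With static synapses the recalled pattern is a stable fixed point. The work described above concerns what happens when the synaptic strengths themselves depend on recent pre-synaptic activity, so that such fixed points lose their stability.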
We recently addressed the question of how the path integral control computation can be implemented in stochastic neural networks.
We demonstrated that the samples generated by such networks provide the data to learn a feedback controller. Using this approach, we
demonstrated control of a stochastic inverted pendulum (Thalmeier
et al. 2016).
Recent successes in machine learning have ignited interesting new
connections between machine learning and quantum physics, loosely referred to
as quantum machine learning. Quantum annealing has been successfully
applied to optimisation problems that arise in machine learning. Machine
learning methods also find useful applications in quantum physics, such as
characterizing the ground state of a quantum Hamiltonian or learning
different phases of matter.
Since 2018, I have been interested in how the quantum formalism can be used to advance
machine learning. The objective is two-fold. The first is to exploit quantum
properties, such as entanglement, in classical data analysis.
The second is to accelerate learning
by implementing such models on quantum hardware.
This article on the quantum Boltzmann machine proposes a method to learn a quantum model from quantum or classical data.
This article argues why adiabatic quantum annealing is unlikely to yield speed-up.
Bayesian methods have great potential for
immediate application in areas outside science. There is a long-standing and quite
unique tradition
in the SNN group of building such applications together with its spin-off companies Smart Research and Promedas.
Here are a few examples:
Genetic inference
We have applied an advanced approximate
inference method (the Cluster Variation Method) to construct haplotypes in complex pedigrees.
The method was shown to outperform the state-of-the-art Monte Carlo sampling
approach on a subset of problems. The software is
publicly available. Contact Kees Albers for details caa at sanger dot ac dot uk.
Aladin is a software tool for performing efficient linkage analysis of a small number of distantly-related individuals. It estimates multipoint IBD probabilities and parametric LOD scores. Contact Kees Albers for details caa at sanger dot ac dot uk.
Oil exploration
For Shell, we built a petrophysical expert system.
It estimates the type of soil and the probability that it contains oil, gas
or other valuable minerals, based on drilling measurements. The system is
based on a Bayesian network where the probability computation is done using
a Monte Carlo sampling method. See Smart Research for further details and other products.
Victim identification
For the Netherlands Forensic Institute, we are
building a victim identification system that uses a Bayesian network to match
victims' DNA profiles against pedigrees of relatives of missing persons,
based on DNA profiles in large databases. See
Bonaparte for further details.
Promedas
We have built the world's largest and most up-to-date medical
expert system for diagnostic advice in internal medicine. The system is
being commercialized by Promedas bv. The system has been operational
at the Utrecht academic hospital since the end of 2008. See
Promedas for further details.
Wine and food
We have built a system that selects the most appropriate wines
to combine with your food.
There is an exciting possibility to use a quantum mechanical wave function to represent a
probability distribution. While classically the probability distribution p(x) is computed for
each x separately, quantum physics computes all 'components' of the wave function
simultaneously and in parallel. This implies that the computation of statistics (means,
correlations) of high-dimensional distributions, which requires exponentially long computation
times using classical machines, could be computed in constant time on a quantum device.
My recent work focusses on learning such quantum systems. The learning step requires the
estimation of the above statistics and is done classically using Monte Carlo sampling. The
long term aim is to replace this step by a quantum computer.
The use of the quantum formalism for learning also yields novel quantum statistics for purely
classical data analysis. These statistics signal entanglement in classical data. The research
focuses on 1) developing fast approximate inference methods for quantum learning and 2) data
analysis using quantum statistics.
Sparse regression with the Garrote
Standard learning problems are to explain a dependent variable ('the output') in terms of independent variables ('the inputs').
In many learning problems, the number of input variables is large compared to the number of available data samples. Examples are found in genetics, neuroimaging and, more generally, in many pattern recognition problems.
In order to obtain a reasonable solution in these cases, the problem needs to be regularized, typically by adding a constraint that enforces a solution with small norm.
In addition often a sparse solution is desired, which explains the output in terms of a (small) subset of the input variables.
The Lasso method is a sparse regression method that uses an L1 norm as regularizer. The Lasso is very fast and can be applied to very large problems. However, the method suffers from
'shrinkage', which means that in certain cases the wrong inputs are identified. Ideally, one would use a regularizer that penalizes the number of inputs rather than their strength. This is achieved using the so-called L0 regularizer. However, finding the solution in this case is significantly more difficult. Examples of approaches are Monte Carlo methods or the
variational garrote.
All sparse methods suffer from strongly correlated inputs. Examples are the spatial correlations between nearby genetic measurements, or pixels in images.
In this project, the student will extend the variational garrote to take these correlations into account and demonstrate the improved performance on neuroimaging or genetic data.
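The difference between the L1 and L0 regularizers discussed above can be made concrete in the special case of orthonormal inputs, where both problems have a closed-form solution in terms of the least-squares coefficients. The sketch below assumes the objectives ½‖y − Xβ‖² + λ‖β‖₁ and ½‖y − Xβ‖² + λ‖β‖₀ with XᵀX = I. The coefficient values and the value of λ are made up for illustration, and the variational garrote itself is not implemented here.

```python
import numpy as np

def soft_threshold(b, lam):
    """Lasso (L1) solution for orthonormal inputs: every coefficient shrinks by lam."""
    return np.sign(b) * np.maximum(np.abs(b) - lam, 0.0)

def hard_threshold(b, lam):
    """L0 solution for orthonormal inputs: keep b_j unchanged if b_j^2 / 2 > lam."""
    return np.where(np.abs(b) > np.sqrt(2.0 * lam), b, 0.0)

# Hypothetical least-squares coefficients: two strong inputs, two weak ones.
b_ols = np.array([3.0, -2.0, 0.4, -0.1])
lam = 0.5

b_l1 = soft_threshold(b_ols, lam)  # strong inputs are kept but biased towards zero
b_l0 = hard_threshold(b_ols, lam)  # strong inputs are kept at their full value
print("L1:", b_l1)
print("L0:", b_l0)
```

In this orthonormal case both methods select the same inputs, and the shrinkage only biases the retained coefficients. With correlated inputs no such closed form exists, and the L0 problem becomes a combinatorial search over subsets of inputs, which is where approximate methods come in.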
Data analysis for sustainable energy consumption
In collaboration with NRLytics, a young start-up in the energy sector, this project aims to
use machine learning methods to analyse and optimize energy consumption. See Project description (in Dutch)
Wim Wiegerinck is a senior researcher working on approximate inference, genetic
inference and various applications, and is associate director of Smart Research bv
Giel van Bergen is a PhD student on the GenoMiX project (with Kees Albers).
He works on the application of Bayesian learning methods for genetics and optimization of animal breeding.
Willem Burgers is senior program developer for Smart Research bv.
Eduardo Dominguez is a postdoc working on approximate inference for quantum machine
learning (start 2/2019).
Roeland Wiersema is a master student working on the quantum perceptron
Alex Kolmus is a master student working on a nanoscale realisation of a Hopfield network
(with Alex Khajetoorians and Misha Katsnelson)
Manu Compen is a master student working on efficient learning methods for the quantum
Boltzmann Machine
Jordi Riemens is a master student working on risk sensitive reinforcement learning
Yannick Lingelman is a PhD student working on sensorimotor control for autonomous driving
(start 4/2019)
Joris Mooij was a PhD student on approximate inference.
Kees Albers was a PhD student and postdoc on approximate
inference methods for genetic linkage analysis. Kees spent 4 years at the Sanger
Institute, Cambridge UK, and has been at Human Genetics in Nijmegen since 2012.
Bram Kasteel was a Bachelor student on the topic of multi-agent control
Stijn Tonk was a Master student on the topic of multi-agent control
Ender Akay was a programmer for Smart Research bv and Promedas bv
Gulliver de Boer was a Bachelor student on the topic of multi-agent control applied to poker
Max Bakker was a Bachelor student on the topic of multi-agent systems
Ben Ruijl was a Bachelor student on the topic of multi-agent systems
Henk Griffioen was a Master student on the topic of genetic association studies
Bart van den Broek was a PhD student on the topic of stochastic optimal control theory
Patrick Lessmann was a PhD student on the topic of stochastic optimal control in the
CompLACS (EU FP7) project.
Elena Zavgorodnyaya was a Master student on the topic of Brain
Computer Interfaces.
Martin Mittag was a master student on the topic of stochastic
optimal control theory with neural networks
Dick van den Broek was a master physics student on the topic of multi-agent systems
Bram Kasteel was a master physics student on the topic of stochastic optimal control theory
Christiaan Schoenaker was a master physics student on the topic of Super Modeling by combining imperfect models (SUMO)
Jonas Ahrendt was an Artificial Intelligence student on the topic of genetic pedigrees and Bonaparte
Joris Bukala was a bachelor physics student on the topic of Monte Carlo methods
Gulliver de Boer was a master physics student on the topic of genetic linkage analysis with Gaussian Process Regression
and implementation on GPUs
Vicenç Gómez was a postdoc on approximate inference and stochastic optimal control in the
CompLACS (EU FP7) project, now at Universidad Pompeu Fabra in Barcelona
Kevin Sharp was a postdoc on a project to develop Bayesian Gaussian process methods for genetic association studies, now at Oxford University
Joris Bierkens was a postdoc on the topic of stochastic optimal control in the CompLACS (EU
FP7) project, now at Warwick University
Takamitsu Matsubara is assistant professor of robotics at the Nara Institute of Science
and Technology and was on sabbatical leave in our group in 2013
Alberto Llera was a PhD student on the topic of Brain Computer Interfaces, now postdoc with Christian Beckman at the Donders Center for Imaging
Satoshi Satoh is assistant professor at the faculty of engineering of
Hiroshima University in Japan. He visited in 2011-2012 to generalize the path integral control
method and to apply this method to concrete problems in control and robotics.
Sep Thijssen was a PhD student funded by Thales Nederland and on the CompLACS project, working on the application of
stochastic optimal control methods for multi-agent systems. See here his very readable PhD Thesis.
Han Nauta was a Master student working on path integral control problems
Hans Ruiz was a PhD student on the NETT project. He worked on the application of stochastic optimal control methods in
neuroscience for multi-agent systems
Dominik Thalmeier was a PhD student on the NETT project. He worked on the application of stochastic optimal control methods in
neuroscience for multi-agent systems
Silvia Menchon is assistant professor at the University of Cordoba (Argentina), visiting in 2015-2016 funded by the Radboud Excellence Initiative.