Bert Kappen

Bert (HJ) Kappen

Bert (HJ) Kappen is a professor of physics at the Department of Biophysics, Radboud University, Nijmegen.

Address: Department of Neurophysics, Donders Center for Neuroscience, Heyendaalseweg 135, 6525 AJ Nijmegen, The Netherlands
Email: b.kappen@science.ru.nl

The physics of machine learning

Keywords: Bayesian inference, learning and reasoning, stochastic control theory, neural networks, statistical physics, quantum machine learning

My research focuses on the design of efficient computational methods for AI and machine learning using ideas and methods from statistical physics and quantum physics. In addition, I am interested in how intelligence and consciousness may arise in the brain. Here, I give a high-level overview of my research interests, both past and present. For details, consult my Google Scholar page.

Bayesian Inference

Due to the essential roles that noise and uncertainty play in perception and learning, a useful way to model intelligence is to use probability models. In the mid-1990s, analog and digital computing, until then separate approaches to modeling intelligence, began to merge around the idea of Bayesian inference: one can generalize the logic of digital computation to a probabilistic calculus, embodied in a so-called graphical model. Similarly, one can generalize dynamical systems to stochastic dynamical systems that allow for a probabilistic description in terms of a Markov process. The Bayesian paradigm has greatly helped to integrate different schools of thought, in particular in artificial intelligence and machine learning, and also provides a computational paradigm for neuroscience. A typical Bayesian computation, whether in the context of a complex data analysis problem or in a stochastic neural network, is to compute an expectation value, which is referred to as Bayesian inference. Exact Bayesian inference is in general intractable, which means that computation time and memory use scale exponentially with the problem size. However, many methods exist to compute these quantities approximately. Most of these methods originate in statistical physics, such as the mean field method, belief propagation or Monte Carlo sampling. Applying these methods to machine learning problems is challenging and is an active field of research to which I have made several contributions.
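As a rough illustration of the point about intractability, the sketch below computes the expectation values of the spins in a small stochastic binary network in three ways: exactly, by summing over all 2^n states, and approximately with naive mean field and with Gibbs (Monte Carlo) sampling. The model p(s) proportional to exp(0.5 s'Ws + theta's) with symmetric W and zero diagonal, the function names and all parameter values are assumptions chosen for the example, not taken from any specific paper.

```python
import itertools
import numpy as np

def exact_means(w, theta):
    """Exact <s_i> by enumerating all 2^n states (cost grows exponentially in n)."""
    n = len(theta)
    states = np.array(list(itertools.product([-1, 1], repeat=n)))
    # Log of the unnormalised probability: 0.5 * s^T w s + theta^T s
    log_p = 0.5 * np.einsum("si,ij,sj->s", states, w, states) + states @ theta
    p = np.exp(log_p - log_p.max())
    p /= p.sum()
    return p @ states

def mean_field_means(w, theta, n_iter=200, damping=0.5):
    """Naive mean field: iterate m_i = tanh(sum_j w_ij m_j + theta_i)."""
    m = np.zeros(len(theta))
    for _ in range(n_iter):
        m = damping * m + (1 - damping) * np.tanh(w @ m + theta)
    return m

def gibbs_means(w, theta, n_sweeps=5000, burn_in=500, seed=0):
    """Monte Carlo estimate of <s_i> using single-spin Gibbs updates."""
    rng = np.random.default_rng(seed)
    n = len(theta)
    s = rng.choice([-1, 1], size=n)
    total = np.zeros(n)
    for sweep in range(n_sweeps + burn_in):
        for i in range(n):
            field = w[i] @ s + theta[i]          # w has zero diagonal
            p_up = 1.0 / (1.0 + np.exp(-2.0 * field))
            s[i] = 1 if rng.random() < p_up else -1
        if sweep >= burn_in:
            total += s
    return total / n_sweeps

# Hypothetical small network with weak random couplings.
rng = np.random.default_rng(1)
n = 8
w = np.triu(rng.normal(scale=0.1, size=(n, n)), 1)
w = w + w.T
theta = rng.normal(scale=0.3, size=n)
print("exact     ", np.round(exact_means(w, theta), 3))
print("mean field", np.round(mean_field_means(w, theta), 3))
print("gibbs     ", np.round(gibbs_means(w, theta), 3))
```

For 8 spins the exact sum is cheap, but every added spin doubles its cost, while the two approximations scale polynomially in n. With weak couplings, as assumed here, both approximations should land close to the exact answer; for strong couplings naive mean field can be poor.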

Control theory

Control theory is a theory from engineering that gives a formal description of how a system, such as a robot or animal, can move from a current state to a future state at minimal cost, where cost can mean time spent, energy spent or any other quantity. Control theory is traditionally used to control industrial plants, airplanes or missiles, but it is also the natural framework to model intelligent behavior in animals or robots. The mathematical formulation of deterministic control theory is very similar to classical mechanics. In fact, classical mechanics can be viewed as a special case of control theory. Stochastic control theory uses the language of stochastic differential equations. For a certain class of stochastic control problems, the solution is described by a linear partial differential equation that can be solved formally as a path integral. This so-called path integral control method provides a deep link between control, inference and statistical physics. This statistical physics view of control theory shows that qualitatively different control solutions exist for different noise levels, separated by phase transitions. The control solutions can be computed using efficient approximate inference methods such as Monte Carlo sampling or deterministic approximation methods. Path integral control theory is being used successfully by leading research groups in robotics worldwide. For more information see the path integral control theory page.
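A minimal sketch of how a path integral control solution can be estimated by Monte Carlo sampling, under simplifying assumptions that are mine and not spelled out above: one-dimensional dynamics dx = u dt + dxi with noise variance nu dt, a control cost of u^2/2 per unit time (so the temperature-like parameter equals nu), and user-supplied end and path costs. Trajectories are sampled from the uncontrolled dynamics, weighted by exp(-S/nu) where S is the cost of each trajectory, and the optimal control is the weighted average of the first noise increment. The function name and arguments are hypothetical.

```python
import numpy as np

def path_integral_control(x0, end_cost, path_cost, nu=1.0, horizon=1.0,
                          dt=0.1, n_samples=200_000, seed=0):
    """Estimate the optimal control at state x0 for dx = u dt + dxi.

    Assumes control cost u^2 / 2 per unit time, so that lambda = nu.
    """
    rng = np.random.default_rng(seed)
    n_steps = int(round(horizon / dt))
    # Noise increments of the uncontrolled dynamics (u = 0).
    dxi = rng.normal(scale=np.sqrt(nu * dt), size=(n_samples, n_steps))
    x = x0 + np.cumsum(dxi, axis=1)            # state after each step
    # Cost of each sampled trajectory: end cost plus integrated state cost.
    s = end_cost(x[:, -1]) + dt * path_cost(x).sum(axis=1)
    # Path weights, shifted by the minimum cost for numerical stability.
    weights = np.exp(-(s - s.min()) / nu)
    weights /= weights.sum()
    # Optimal control: weighted mean of the first noise increment per unit time.
    return np.sum(weights * dxi[:, 0]) / dt

# Example: quadratic end cost alpha/2 * x^2 and no state cost along the path.
alpha = 1.0
u_hat = path_integral_control(
    x0=1.0,
    end_cost=lambda x: 0.5 * alpha * x**2,
    path_cost=lambda x: np.zeros_like(x),
)
# For this linear-quadratic case the exact answer is -alpha*x0/(1 + alpha*T).
u_exact = -alpha * 1.0 / (1.0 + alpha * 1.0)
print(u_hat, u_exact)
```

The quadratic example is chosen only because its solution is known in closed form, which makes the sampler easy to check. Sampling from the uncontrolled dynamics becomes inefficient when few trajectories carry most of the weight, which is where better approximate inference methods come in.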

Quantum Machine Learning

Recent successes in machine learning have ignited interesting new connections between machine learning and quantum physics, loosely referred to as quantum machine learning. Machine learning methods are finding useful applications in quantum physics, such as characterizing the ground state of a quantum Hamiltonian or learning different phases of matter. Since 2018, I have been interested in how the quantum formalism can be used to advance machine learning. Recent work:

Quantum Boltzmann Machine: One line of work is the quantum Boltzmann machine, which is a method to learn a quantum Hamiltonian from classical or quantum data. The learning rule requires the computation of quantum spin expectation values, which is intractable, as in the classical case (article). In another paper we propose a new method to accelerate learning using a quantum circuit (article).
Adiabatic quantum computing: Another line of work is to explore the possibility of quantum advantage using adiabatic quantum computing. In particular, we generalize the well-known quadratic speedup of Grover search to general optimization problems. We show that this is possible in principle, but that in practice it faces two serious obstacles. First, the speedup is achievable only with an optimized annealing schedule that requires the exact value of the annealing parameter at the phase transition, and computing this number is intractable in general. Second, the value needs to be specified with a numerical precision that increases exponentially with the problem size (article).
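To make the role of the annealing parameter concrete, here is a small sketch for the standard adiabatic Grover search only, not for the generalization to optimization problems discussed above. It assumes the textbook interpolating Hamiltonian H(s) = (1 - s)(1 - |psi0><psi0|) + s(1 - |m><m|), with |psi0> the uniform superposition over N items and |m> the marked item, and compares the numerically computed spectral gap with the known closed form. The function names are hypothetical.

```python
import numpy as np

def grover_gap(s, n_items, marked=0):
    """Gap between the two lowest eigenvalues of the adiabatic Grover Hamiltonian."""
    psi0 = np.full(n_items, 1.0 / np.sqrt(n_items))   # uniform superposition
    target = np.zeros(n_items)
    target[marked] = 1.0                               # marked item
    identity = np.eye(n_items)
    h = ((1 - s) * (identity - np.outer(psi0, psi0))
         + s * (identity - np.outer(target, target)))
    energies = np.linalg.eigvalsh(h)                   # sorted ascending
    return energies[1] - energies[0]

def grover_gap_formula(s, n_items):
    """Closed-form gap for the same Hamiltonian."""
    return np.sqrt(1 - 4 * (1 - 1 / n_items) * s * (1 - s))

n_items = 16
for s in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(s, grover_gap(s, n_items), grover_gap_formula(s, n_items))
```

In this special case the gap is smallest at s = 1/2, where it equals 1/sqrt(N), and a schedule that slows down near that point recovers the quadratic speedup. The obstacle described above is that for a general optimization problem the location of the corresponding point is not known analytically.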

Computing with atoms

Since 2020 we have had an intense collaboration with the scanning tunneling microscopy group of professor Alex Khajetoorians. In this collaboration, we have shown the possibility of realizing a stochastic neural network at the atomic scale. The spins in this network are bi-stable atoms that stochastically switch between two states (up and down). Each spin or neuron is characterized by its asymmetry (the probability of being in the up state minus that of being in the down state) and its mean residence time (the mean time between switches). Residence times of different spins can differ by many orders of magnitude. We proposed that fast spins encode the firing or non-firing of neurons and slow spins encode binary learning elements, i.e. synapses. In this way, a physical substrate can implement learning as the long-term change of the slow variables (article).
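The two quantities that characterize a spin can be illustrated with a toy simulation. The sketch below assumes, purely for illustration, that a bi-stable spin is a memoryless two-state process with exponentially distributed dwell times of mean tau_up and tau_down; the real atomic dynamics may differ. It then estimates the asymmetry and the mean residence time from the simulated dwell times. All names and parameter values are hypothetical.

```python
import numpy as np

def simulate_dwell_times(tau_up, tau_down, n_pairs=50_000, seed=0):
    """Draw alternating dwell times in the up and down states.

    Assumes exponentially distributed dwell times (a two-state Markov process).
    """
    rng = np.random.default_rng(seed)
    up = rng.exponential(tau_up, size=n_pairs)
    down = rng.exponential(tau_down, size=n_pairs)
    return up, down

def asymmetry_and_residence_time(up, down):
    """Estimate p_up - p_down and the mean time between switches."""
    t_up, t_down = up.sum(), down.sum()
    asymmetry = (t_up - t_down) / (t_up + t_down)
    mean_residence = (t_up + t_down) / (len(up) + len(down))
    return asymmetry, mean_residence

# A "fast" and a "slow" spin with the same asymmetry but very different time scales.
for tau_up, tau_down in [(3e-3, 1e-3), (3e2, 1e2)]:
    up, down = simulate_dwell_times(tau_up, tau_down)
    print(tau_up, tau_down, asymmetry_and_residence_time(up, down))
```

Under these assumptions the expected asymmetry is (tau_up - tau_down) / (tau_up + tau_down) and the expected mean residence time is (tau_up + tau_down) / 2, so the two example spins share an asymmetry while their residence times differ by five orders of magnitude.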

Identification of missing persons through DNA

In 2010 we built a software system, called Bonaparte, for the identification of missing persons on the basis of their DNA. The method matches an individual's DNA to pedigrees of relatives using a Bayesian network. It is currently used by the Netherlands Forensic Institute (NFI), by the Australian police across the entire continent, in the Interpol I-Familia system, and in the identification of victims from the Spanish Civil War. See Bonaparte for further details.

Medical diagnosis

Assisting medical doctors in diagnosing patients based on their symptoms is one of the oldest proposed uses of artificial intelligence. However, to this day, building such systems with high accuracy has proven surprisingly difficult. In collaboration with the Erasmus MC in Rotterdam, we are building such an expert system for the diagnosis of internal medicine related diseases as they occur in the emergency department. The system is based on a Bayesian network that is specified on the basis of the knowledge of medical experts and textbooks.

Teaching

Inleiding Machine Learning BA NWI physics and math

Statistical Machine Learning MA NWI computer science and FSS AI

Advanced Computational Neuroscience MA NWI physics and math and MA Donders research

Machine Learning MA NWI physics and math and MA Donders research

Advanced Machine Learning MA NWI physics and math and MA Donders research

Short course on control theory, ACNS

Short course on machine learning, Pompeu Fabra spring 2003

Short course on control theory, Madrid fall 2010

Short course on control theory, UCL 2011

Short course on control theory, Madrid 2012

Tutorial ICTP Summer school Machine Learning, Trieste 2012

Short course on control theory, Madrid 2013

Tutorial ICTP Spring College on Physics of Complex systems, Trieste May-June 2013

Short course on control theory, Madrid 2014

Information, Physics, and Computation MA 2014

Introduction to Biophysics BA (discontinued)

Neural networks and Information theory BA (discontinued, content moved to Inleiding Machine Learning BA from 2012-2013)

Neurophysics BA (discontinued, content moved to Computational Neuroscience MA from 2013-2014)

Neurophysics MA (discontinued)

Computational Physics MA (discontinued)

Master Projects

Quantum machine learning

There is an exciting possibility to use a quantum mechanical wave function to represent a probability distribution. While classically the probability distribution p(x) is computed for each x separately, quantum physics computes all 'components' of the wave function simultaneously and in parallel. This implies that statistics (means, correlations) of high-dimensional distributions, whose computation requires exponentially long computation times on classical machines, could be computed in constant time on a quantum device. My recent work focuses on learning such quantum systems. The learning step requires the estimation of the above statistics and is done classically using Monte Carlo sampling. The long-term aim is to replace this step by a quantum computer. The use of the quantum formalism for learning also yields novel quantum statistics for purely classical data analysis. These statistics signal entanglement in classical data. The research focuses on 1) developing fast approximate inference methods for quantum learning and 2) data analysis using quantum statistics.

Sparse regression with the Garrote

Standard learning problems are to explain a dependent variable ('the output') in terms of independent variables ('the inputs'). In many learning problems, the number of input variables is large compared to the number of available data samples. Examples are found in genetics, neuroimaging and in general in many pattern recognition problems. In order to obtain a reasonable solution in these cases, the problem needs to be regularized, typically by adding a constraint that enforces a solution with small norm. In addition, a sparse solution is often desired, which explains the output in terms of a (small) subset of the input variables. The Lasso method is a sparse regression method that uses an L1 norm as regularizer. The Lasso is very fast and can be applied to very large problems. However, the method suffers from 'shrinkage': the estimated coefficients are biased toward zero, and as a result in certain cases the wrong inputs are identified. Ideally, one would use a regularizer that penalizes the number of inputs rather than their strength. This is achieved using the so-called L0 regularizer. However, finding the solution in this case is significantly more difficult. Examples of approaches are Monte Carlo methods or the variational garrote. All sparse methods suffer from strongly correlated inputs. Examples are the spatial correlations between nearby genetic measurements, or pixels in images. In this project, the student will extend the variational garrote to take these correlations into account and demonstrate the improved performance on neuroimaging or genetic data.
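The difference between L1 and L0 regularization is easiest to see in the special case of orthonormal inputs, where both problems can be solved in closed form. The sketch below is an illustration of that textbook case only; it is not the variational garrote and it ignores correlated inputs, which are the subject of the project. With b the least-squares coefficients and lam the regularization strength (names chosen for this example), the L1 penalty gives soft thresholding and the L0 penalty gives hard thresholding.

```python
import numpy as np

def soft_threshold(b, lam):
    """Minimiser of 0.5*(b - w)^2 + lam*|w| per coefficient (Lasso, orthonormal inputs)."""
    return np.sign(b) * np.maximum(np.abs(b) - lam, 0.0)

def hard_threshold(b, lam):
    """Minimiser of 0.5*(b - w)^2 + lam*[w != 0] per coefficient (L0 penalty)."""
    # Keeping a coefficient costs lam; dropping it costs 0.5*b^2.
    return np.where(0.5 * b**2 > lam, b, 0.0)

b = np.array([3.0, 0.5, -2.0])     # hypothetical least-squares coefficients
lam = 1.0
print("L1:", soft_threshold(b, lam))
print("L0:", hard_threshold(b, lam))
```

Both penalties remove the small coefficient, but the L1 solution also shrinks the surviving coefficients by lam, while the L0 solution leaves them untouched. For general, non-orthonormal inputs the L0 problem no longer decouples per coefficient, which is why it is so much harder to solve.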

Data analysis for sustainable energy consumption

In collaboration with NRLytics, a young start-up in the energy sector, this project aims to use machine learning methods to analyse and optimize energy consumption. See the project description (in Dutch).

Contact information

SNN Machine Learning

Radboud University

Huygensgebouw 00.829

Heyendaalseweg 135

NL 6525 AJ Nijmegen

The Netherlands

+31 24 3614241 (phone)

b.kappen-at-science-dot-ru-dot-nl

Recent meetings organized

NIPS Workshop Probabilistic optimal control, Whistler 2009

SNN Symposium Intelligent Machines, Nijmegen 2010

School on large scale problems in machine learning and workshop on common concepts in machine learning and statistical physics, ICTP Trieste 2012

Workshop on the statistical physics of inference and control theory, Granada 2012 website and videolectures

LSOLDM 2013

NIPS 2013 Workshop Planning with Information Constraints for control, reinforcement learning, computational neuroscience, robotics and games

LSOLDM 2014

Intelligent Machines 2015

Organizations

Foundation for Neural Networks (SNN)

Visiting professor at Gatsby Computational Neuroscience Unit, University College London

Current group members

Wim Wiegerinck is a senior researcher working on approximate inference, genetic inference and various applications, and is associate director of Smart Research bv.

Giel van Bergen is a PhD student on the GenoMiX project (with Kees Albers). He works on the application of Bayesian learning methods for genetics and optimization of animal breeding.

Willem Burgers is senior program developer for Smart Research bv.

Eduardo Dominguez is a postdoc working on approximate inference for quantum machine learning (start 2/2019).

Roeland Wiersema is a master student working on the quantum perceptron.

Alex Kolmus is a master student working on a nanoscale realisation of a Hopfield network (with Alex Khajetoorians and Misha Katsnelson).

Manu Compen is a master student working on efficient learning methods for the quantum Boltzmann Machine.

Jordi Riemens is a master student working on risk sensitive reinforcement learning.

Yannick Lingelman is a PhD student working on sensorimotor control for autonomous driving (start 4/2019).

Former group members

Tom Heskes (Radboud University Nijmegen) was a PhD student and postdoc on online learning.

Martijn Leisink (D66) was a PhD student and postdoc on approximate inference.

Taylan Cemgil (Bogazici University, Turkey) was a PhD student on time-series modeling of music.

Joris Mooij was a PhD student on approximate inference.

Kees Albers was a PhD student and postdoc on approximate inference methods for genetic linkage analysis. Kees spent four years at the Sanger Institute, Cambridge, UK, and has been at Human Genetics in Nijmegen since 2012.

Bram Kasteel was a Bachelor student on the topic of multi-agent control.

Stijn Tonk was a Master student on the topic of multi-agent control.

Ender Akay was a programmer for Smart Research bv and Promedas bv.

Gulliver de Boer was a Bachelor student on the topic of multi-agent control applied to poker.

Max Bakker was a Bachelor student on the topic of multi-agent systems.

Ben Ruijl was a Bachelor student on the topic of multi-agent systems.

Henk Griffioen was a Master student on the topic of genetic association studies.

Bart van den Broek was a PhD student on the topic of stochastic optimal control theory.

Mohammad Azar was a PhD student on the topic of reinforcement learning, now at Carnegie Mellon University.

Patrick Lessmann was a PhD student on the topic of stochastic optimal control in the CompLACS (EU FP7) project.

Elena Zavgorodnyaya was a Master student on the topic of Brain Computer Interfaces.

Martin Mittag was a master student on the topic of stochastic optimal control theory with neural networks

Dick van den Broek was a master physics student on the topic of multi-agent systems

Bram Kasteel was a master physics student on the topic of stochastic optimal control theory

Christiaan Schoenaker was a master physics student on the topic of Super Modeling by combining imperfect models (SUMO)

Jonas Ahrendt was an Artificial Intelligence student on the topic of genetic pedigrees and Bonaparte.

Joris Bukala was a bachelor physics student on the topic of Monte Carlo methods.

Gulliver de Boer was a master physics student on the topic of genetic linkage analysis with Gaussian Process Regression and implementation on GPUs.

Vicenç Gómez was a postdoc on approximate inference and stochastic optimal control in the CompLACS (EU FP7) project, now at Universidad Pompeu Fabra in Barcelona.

Kevin Sharp was a postdoc on a project to develop Bayesian Gaussian process methods for genetic association studies, now at Oxford University.

Joris Bierkens was a postdoc on the topic of stochastic optimal control in the CompLACS (EU FP7) project, now at Warwick University.

Takamitsu Matsubara is an assistant professor of robotics at the Nara Institute of Science and Technology; he was on sabbatical leave in our group in 2013.

Alberto Llera was a PhD student on the topic of Brain Computer Interfaces, now postdoc with Christian Beckman at the Donders Center for Imaging.

Satoshi Satoh is an assistant professor at the faculty of engineering of Hiroshima University in Japan. He visited in 2011-2012 to generalize the path integral control method and to apply this method to concrete problems in control and robotics.

Sep Thijssen was a PhD student funded by Thales Nederland and on the CompLACS project, working on the application of stochastic optimal control methods for multi-agent systems. See here his very readable PhD Thesis.

Han Nauta was a Master student working on path integral control problems.

Hans Ruiz was a PhD student on the NETT project. He worked on the application of stochastic optimal control methods in neuroscience for multi-agent systems.

Dominik Thalmeier was a PhD student on the NETT project. He worked on the application of stochastic optimal control methods in neuroscience for multi-agent systems.

Silvia Menchon is an assistant professor at the University of Cordoba (Argentina), visiting in 2015-2016 funded by the Radboud Excellence Initiative.

Externally funded projects

BrainGain (Smart mix) (ended)

Genetic association study using machine learning methods (Donders internal round) (ended)

Brain imaging, genetics and psychiatry using machine learning methods (NWO Cognition) (ended)

Multi-agent systems (with Thales D-Cis lab) (ended)

Bovinose (Smart Research, EU FP7) (ended)

CompLACS (EU FP7) (ended)

SUMO (EU FP7) (ended)

NETT (ended)

Genomix. Using neural networks to improve animal breeding (with U Wageningen, funded by STW)

Decentralized UAV control (with TU Delft Micro Air Vehicle laboratory, funded by NWO)

Learning and control of next generation deep neural technologies (with Riccardo Zecchina at Bocconi University, funded by ONR)

Quantum Learning (funded by NWA)

Vacancies

There are currently no vacancies
