Machine Learning - autumn 2017

Course information

The course provides an advanced introduction to machine learning with emphasis on the Bayesian perspective. The course is intended for Master's students in physics as well as AI/computer science students with a sufficient mathematical background. AI/computer science students are highly recommended to take the course Statistical machine learning prior to this course.
For physics and math students, this course is the follow-up to the bachelor course Inleiding Machine Learning (Introduction to Machine Learning).

Course material: all machine learning material is summarized in these slides (slides.pdf).

Format: The course consists of weekly sessions, mainly taught by me. The emphasis is on learning the material through written and computer exercises.

Presentation schedule: Note that the schedule may change during the course. A detailed breakdown of the chapters to be presented will be discussed during the course.

Exercises in brackets are important to understand and have solutions in the book; they do not count towards the grade.
Each entry below lists: course week, calendar week, topic, chapter (MacKay) or other material, weekly exercises, and computer exercises (hand in before the end of the course).
1 36 Probability, entropy and inference Chapter 2
Exercises 2.4, 2.6 (and its continuation), 2.7, 2.8 to be discussed in class.

Exercises: 2.10, 2.14, 2.16ab, 2.18, 2.19, 2.26
2 37 More about inference
Model comparison and Occam's razor
Chapter 3, Chapter 28
Exercises 3.3, 3.4, 3.15 to be discussed in class.
Exercises: 3.1, 3.2, 3.5, 3.6, (3.7 if you like), 3.8, 3.9
Exercises: 28.1, 28.2 only for model H_1, (28.3 if you like)
3 38-39 Gaussians, Gaussian mixtures, EM, Laplace method Chapters 20, 21, 22, 24, 27 Exercises: 21.3, 22.4, (22.5), 22.6, 22.7, 22.8a, (22.12), 22.13
4 40 Monte Carlo Methods (1) Sections 29.1-29.5 Exercises: 29.3, 29.15
Computer exercise: the simulations of Exercise 29.13, reproducing fig. 29.20.
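The workhorse for these exercises is the Metropolis algorithm; a minimal Python sketch on an illustrative bimodal target (the density, step size, and sample count are assumptions, not taken from the book):

    import numpy as np

    rng = np.random.default_rng(0)

    def log_p(x):
        # Unnormalized log density of an illustrative bimodal 1-D target.
        return np.logaddexp(-0.5 * (x - 2.0)**2, -0.5 * (x + 2.0)**2)

    def metropolis(n_samples, step=1.0, x0=0.0):
        # Random-walk Metropolis: propose x' = x + step*eps,
        # accept with probability min(1, p(x')/p(x)).
        x, samples = x0, []
        for _ in range(n_samples):
            x_new = x + step * rng.normal()
            if np.log(rng.random()) < log_p(x_new) - log_p(x):
                x = x_new                 # accept
            samples.append(x)             # on rejection the old state repeats
        return np.array(samples)

    samples = metropolis(10000)
    print("mean:", samples.mean(), "std:", samples.std())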
5 41 Markov processes, ergodicity
Monte Carlo Methods (2), HMC
MCMC for the perceptron posterior
Sections 29.6, 30.1, 30.3
Chapters 38, 39, 41
An example of Bayesian inference in perceptron learning using MCMC methods. The files (Matlab files and instructions) needed to do this exercise can be found here: [mcmc_mackay.tar.gz].
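For orientation, a minimal Python sketch of this kind of computation, assuming a logistic single neuron with a Gaussian weight prior (as in MacKay Chapters 39 and 41); the toy data, prior precision, and proposal width are illustrative stand-ins for what the Matlab package provides:

    import numpy as np

    rng = np.random.default_rng(1)

    # Toy data: p inputs of dimension n, binary targets t in {0,1}; a stand-in
    # for the data set shipped with mcmc_mackay.tar.gz (hypothetical values).
    p, n = 40, 2
    X = rng.normal(size=(p, n))
    t = (X @ np.array([2.0, -1.0]) > 0).astype(float)

    alpha = 0.01          # Gaussian prior precision on the weights (assumed)

    def log_posterior(w):
        a = X @ w
        # Logistic neuron: log P(t=1|a) = a - log(1+e^a), log P(t=0|a) = -log(1+e^a)
        return np.sum(t * a - np.logaddexp(0.0, a)) - 0.5 * alpha * w @ w

    def metropolis(n_steps, step=0.2):
        w, lp, ws = np.zeros(n), log_posterior(np.zeros(n)), []
        for _ in range(n_steps):
            w_new = w + step * rng.normal(size=n)
            lp_new = log_posterior(w_new)
            if np.log(rng.random()) < lp_new - lp:
                w, lp = w_new, lp_new
            ws.append(w.copy())
        return np.array(ws)

    ws = metropolis(20000)[5000:]        # discard burn-in
    print("posterior mean of the weights:", ws.mean(axis=0))

Predictions then average the neuron's output over the sampled weights rather than using a single fitted weight vector.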
42 no class
6 43 Ising model Chapter 31 Exercises: 31.1, 31.3; extra exercises 29, 31. Computer exercise: compare simulated annealing with iterative improvement on the Ising model; see simulated_annealing.zip.
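A minimal Python sketch of the comparison on an illustrative spin-glass instance (the couplings, system size, and annealing schedule are assumptions; the zip file defines the actual exercise):

    import numpy as np

    rng = np.random.default_rng(2)

    # Illustrative Ising energy E(s) = -1/2 sum_ij J_ij s_i s_j with random
    # symmetric couplings (a spin glass).
    N = 50
    J = rng.normal(size=(N, N)); J = (J + J.T) / 2; np.fill_diagonal(J, 0)

    def energy(s):
        return -0.5 * s @ J @ s

    def run(betas):
        # Metropolis single-spin flips with inverse-temperature schedule betas.
        # A huge constant beta is iterative improvement (greedy descent);
        # a slowly increasing schedule is simulated annealing.
        s = rng.choice([-1, 1], size=N)
        for beta in betas:
            i = rng.integers(N)
            dE = 2 * s[i] * (J[i] @ s)            # energy change of flipping spin i
            if dE < 0 or rng.random() < np.exp(-beta * dE):
                s[i] = -s[i]
        return energy(s)

    steps = 20000
    print("iterative improvement:", run(np.full(steps, 1e6)))
    print("simulated annealing:  ", run(np.linspace(0.01, 5.0, steps)))

Averaging the final energies over many restarts makes the comparison between the two schedules meaningful.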
7 44 Variational inference Chapter 33 Computer exercise: write a computer algorithm that reproduces fig. 33.4.
Exercise 33.4
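Chapter 33's central tool is the factorized (mean-field) approximation, obtained by iterating m_i = tanh(h_i + sum_j J_ij m_j) and monitoring the variational free energy. A minimal sketch with illustrative random couplings (this demonstrates the general technique, not a guaranteed reproduction of fig. 33.4):

    import numpy as np

    rng = np.random.default_rng(3)

    # Factorized variational approximation q(s) = prod_i q_i(s_i) for an
    # Ising-type distribution with couplings J and fields h (illustrative values).
    N = 20
    J = 0.1 * rng.normal(size=(N, N)); J = (J + J.T) / 2; np.fill_diagonal(J, 0)
    h = 0.1 * rng.normal(size=N)

    m = np.zeros(N)                         # magnetizations m_i = <s_i>_q
    for _ in range(200):                    # mean-field fixed-point iteration
        m = np.tanh(h + J @ m)

    def variational_free_energy(m):
        q1 = (1 + m) / 2                    # q_i(s_i = +1)
        entropy = -np.sum(q1 * np.log(q1) + (1 - q1) * np.log(1 - q1))
        mean_energy = -0.5 * m @ J @ m - h @ m
        return mean_energy - entropy        # upper bounds the true free energy

    print("m[:5] =", np.round(m[:5], 3), " F_var =", variational_free_energy(m))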
8 45 Boltzmann Machines and Mean Field theory Chapter 43
handouts chapters 1-2
handouts chapter 2, exercises 1a, 2, 3. Computer exercise: write a computer program to implement the Boltzmann machine learning rule as given on p. 21 of chapter 2. Use N=10 neurons and generate random binary patterns. Use these data to compute the clamped statistics (x_i x_j)_c and (x_i)_c. Use K=200 learning steps. In each learning step, use T=500 steps of sequential stochastic dynamics to compute the free statistics (x_i x_j) and (x_i). Test convergence by plotting the size of the change in weights versus iteration. A minimal sketch follows below.
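A minimal Python sketch of this exercise with the stated parameters (N=10, K=200, T=500); the number of patterns P and the learning rate eta are illustrative assumptions, and Glauber dynamics is used for the sequential stochastic updates:

    import numpy as np

    rng = np.random.default_rng(4)
    N, P = 10, 20            # N neurons as specified; P patterns is an assumption
    K, T = 200, 500          # learning steps and sampling sweeps, as specified
    eta = 0.05               # learning rate (assumed)

    # Random binary patterns and the clamped statistics (x_i x_j)_c and (x_i)_c
    X = rng.choice([-1, 1], size=(P, N))
    xx_c = X.T @ X / P
    x_c = X.mean(axis=0)

    w = np.zeros((N, N))     # symmetric weights, zero diagonal
    th = np.zeros(N)         # thresholds

    def free_stats(w, th):
        # T sweeps of sequential stochastic (Glauber) dynamics to estimate
        # the free statistics (x_i x_j) and (x_i).
        x = rng.choice([-1, 1], size=N).astype(float)
        xx, xm = np.zeros((N, N)), np.zeros(N)
        for _ in range(T):
            for i in range(N):
                p_plus = 1.0 / (1.0 + np.exp(-2.0 * (w[i] @ x + th[i])))
                x[i] = 1.0 if rng.random() < p_plus else -1.0
            xx += np.outer(x, x); xm += x
        return xx / T, xm / T

    max_dw = []
    for k in range(K):
        xx, xm = free_stats(w, th)
        dw = eta * (xx_c - xx); np.fill_diagonal(dw, 0)
        w += dw; th += eta * (x_c - xm)
        max_dw.append(np.abs(dw).max())     # plot this versus k to test convergence

Plotting max_dw against k should show the weight changes shrinking as the free statistics approach the clamped ones.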
A much more efficient learning method can be obtained by using mean-field theory and the linear response correction. Build a classifier for the MNIST data based on the Boltzmann machine as described in section 2.5.1.
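For the mean-field part, a sketch under the assumption that the handouts use the standard linear-response inversion, which equates the clamped covariance C to the mean-field susceptibility (the random data below is a placeholder, not MNIST):

    import numpy as np

    rng = np.random.default_rng(5)

    # Illustrative binary data (a placeholder for MNIST-derived patterns).
    N, P = 10, 500
    X = rng.choice([-1, 1], size=(P, N))

    m = X.mean(axis=0)
    C = X.T @ X / P - np.outer(m, m)         # clamped covariance matrix

    # Mean-field + linear-response inversion (assumed form):
    # (C^-1)_ij = delta_ij / (1 - m_i^2) - w_ij, solved directly for w.
    A = np.diag(1.0 / (1.0 - m**2)) - np.linalg.inv(C)
    w = A.copy(); np.fill_diagonal(w, 0)     # off-diagonal entries are the weights
    th = np.arctanh(m) - w @ m               # thresholds from the mean-field equation
    print("w[0, :4] =", np.round(w[0, :4], 3))

Because this replaces the sampling loop of the previous exercise by one matrix inversion, it scales to the larger models needed for a MNIST classifier.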
9 46 Supervised learning: Perceptrons and Multi-Layered Perceptrons handouts chapter 3 (HKP 5 and 6)
handouts chapter 3, exercises 2, 3
  • Write a computer program that implements the perceptron learning rule. Take as data p random input vectors of dimension n with binary components. Take as outputs random assignments ±1. Take n=50 and test empirically that for p < 2n the rule almost always converges and for p > 2n it almost never converges.
  • Reconstruct the curve C(p,n) as a function of p for n=50 in the following way. For each p, randomly construct a number of learning problems and compute the fraction of these problems for which the perceptron learning rule converges. Plot this fraction versus p. A sketch of the basic experiment follows this list.
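A minimal Python sketch of the basic experiment (the epoch budget and the number of trials per p are illustrative choices):

    import numpy as np

    rng = np.random.default_rng(6)

    def converges(p, n, max_epochs=500):
        # Perceptron learning rule on p random binary patterns with random
        # +/-1 labels; True if a separating weight vector is found within
        # the (finite) epoch budget.
        X = rng.choice([-1, 1], size=(p, n)).astype(float)
        t = rng.choice([-1, 1], size=p)
        w = np.zeros(n)
        for _ in range(max_epochs):
            errors = 0
            for x, y in zip(X, t):
                if y * (w @ x) <= 0:     # misclassified (or on the boundary)
                    w += y * x           # perceptron update
                    errors += 1
            if errors == 0:
                return True
        return False

    n = 50
    for p in (50, 80, 100, 120, 150):    # capacity: transition near p = 2n
        frac = np.mean([converges(p, n) for _ in range(20)])
        print(f"p = {p:3d}: fraction converged = {frac:.2f}")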
Write a multi-layered perceptron learning algorithm to classify two of the MNIST classes (for instance the 3s against the 7s). Optimize the architecture by varying the number of hidden units and hidden layers. Test different learning methods, such as naive gradient descent, momentum, and conjugate gradient descent. Compare the quality of your solution with results reported in the literature. A starting-point sketch follows below.
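As a starting point, a minimal numpy sketch of a one-hidden-layer MLP with logistic output trained by naive batch gradient descent; fetching MNIST through scikit-learn's fetch_openml is one option, and the hidden size, learning rate, number of epochs, and train/test split are illustrative assumptions:

    import numpy as np
    from sklearn.datasets import fetch_openml    # one way to obtain MNIST

    rng = np.random.default_rng(7)

    # Keep two classes, e.g. the 3s and the 7s, with pixels scaled to [0, 1].
    mnist = fetch_openml('mnist_784', version=1, as_frame=False)
    mask = np.isin(mnist.target, ['3', '7'])
    X = mnist.data[mask] / 255.0
    t = (mnist.target[mask] == '3').astype(float)
    idx = rng.permutation(len(X))
    Xtr, ttr = X[idx[:10000]], t[idx[:10000]]
    Xte, tte = X[idx[10000:12000]], t[idx[10000:12000]]

    # One hidden tanh layer, logistic output, cross-entropy loss.
    n_hid, eta = 30, 1.0
    W1 = 0.01 * rng.normal(size=(784, n_hid)); b1 = np.zeros(n_hid)
    W2 = 0.01 * rng.normal(size=n_hid); b2 = 0.0

    def sigmoid(a):
        return 1.0 / (1.0 + np.exp(-a))

    for epoch in range(200):
        H = np.tanh(Xtr @ W1 + b1)              # hidden activations
        y = sigmoid(H @ W2 + b2)                # output probabilities
        d2 = y - ttr                            # output delta (cross-entropy)
        d1 = np.outer(d2, W2) * (1 - H**2)      # backpropagated hidden deltas
        W2 -= eta * H.T @ d2 / len(Xtr); b2 -= eta * d2.mean()
        W1 -= eta * Xtr.T @ d1 / len(Xtr); b1 -= eta * d1.mean(axis=0)

    y_test = sigmoid(np.tanh(Xte @ W1 + b1) @ W2 + b2)
    print("test accuracy:", np.mean((y_test > 0.5) == tte))

Momentum, conjugate gradients, and deeper architectures, as asked for above, can be grafted onto the same forward/backward pass.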
10 47 MLPs, Deep networks
Sparse regression, Lasso
Sparse regression computer exercise
  • Derive the sequential Gauss-Seidel update rule from slide 204 Eq. 1.
  • Write your own Lasso method using coordinate descent.
  • Test your algorithm on data set 1 (lasso data). Reproduce a figure similar to slide 208. Find the optimal value of gamma by cross-validation. Compare the Lasso result with ridge regression (with the ridge regression parameter optimized by cross-validation).
  • Consider the example of correlated inputs on slide 212. Reproduce these results with your software, using data generated by correlated_data.m. Compute the input-output correlations b_i and use these to explain the observed phenomenon.
Write a brief report on your findings and include your source code. A sketch of the coordinate-descent step follows below.
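A minimal Python sketch of coordinate-descent Lasso (each one-dimensional subproblem is solved by soft thresholding; the generated data and the value of gamma are placeholders for the course's lasso data set):

    import numpy as np

    def lasso_cd(X, y, gamma, n_iter=100):
        # Minimize 1/(2n) ||y - X w||^2 + gamma ||w||_1 by cyclic coordinate
        # descent over the weights.
        n, d = X.shape
        C = X.T @ X / n                  # input covariance C_ij
        b = X.T @ y / n                  # input-output correlations b_i
        w = np.zeros(d)
        for _ in range(n_iter):
            for i in range(d):
                r = b[i] - C[i] @ w + C[i, i] * w[i]     # exclude w_i itself
                w[i] = np.sign(r) * max(abs(r) - gamma, 0.0) / C[i, i]
        return w

    # Illustrative sparse-regression data (a stand-in for the course data set).
    rng = np.random.default_rng(8)
    X = rng.normal(size=(100, 20))
    w_true = np.zeros(20); w_true[:3] = [2.0, -1.5, 1.0]
    y = X @ w_true + 0.1 * rng.normal(size=100)
    print(np.round(lasso_cd(X, y, gamma=0.05), 2))

Sweeping gamma and recording which weights stay nonzero gives the regularization-path figure the exercise asks for.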
11 48 Sparse regression
Variational Garrote
12 49 Discrete time control
dynamic programming
Bellman equation
Bertsekas 2-5, 13-14, 18, 21-32 (2nd ed.)
Bertsekas 2-5, 10-12, 16-27, 30-32 (1st ed.)
Kappen ICML tutorial 1.2
Exercise: carry out the calculations needed to verify that J_0(1)=2.7 and J_0(2)=2.818 in Bertsekas Example 3.2 on p. 23 in Copies 1b. A generic sketch of the backward recursion follows below.
extra exercises 1, 2a, 2b
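The backward recursion J_k(x) = min_u [g_k(x,u) + J_{k+1}(f_k(x,u))] can be coded generically; a minimal Python sketch for a finite-state, finite-horizon problem (states, costs, and transitions are illustrative placeholders, not the data of Bertsekas's Example 3.2):

    import numpy as np

    rng = np.random.default_rng(9)
    S, U, K = 3, 2, 4                        # states, controls, horizon
    cost = rng.integers(0, 5, size=(K, S, U)).astype(float)   # stage costs g_k(x,u)
    nxt = rng.integers(0, S, size=(K, S, U))                  # successors f_k(x,u)
    J = np.zeros(S)                          # terminal cost J_K(x)
    policy = np.zeros((K, S), dtype=int)

    for k in reversed(range(K)):             # backward dynamic programming
        Q = cost[k] + J[nxt[k]]              # Q[x, u] = g_k(x,u) + J_{k+1}(f_k(x,u))
        policy[k] = Q.argmin(axis=1)
        J = Q.min(axis=1)

    print("J_0 per state:", J)               # compare against hand calculations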
13 50 Continuous time control
Hamilton-Jacobi-Bellman Equation
Pontryagin Minimum Principle
Stochastic differential equations
Stochastic optimal control
LQ examples, Portfolio management
Kappen ICML tutorial 1.3, 1.4
extra exercises 2c, 3
14 51 Path integral control theory
Kappen ICML tutorial 1.6
Thijssen, Kappen
Kappen, Ruiz
extra exercises 4 and 5
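A minimal Python sketch of the basic path-integral estimator from the tutorial, u*(x,t) dt = <exp(-S/lambda) dxi> / <exp(-S/lambda)> averaged over uncontrolled rollouts, on an illustrative 1-D problem with terminal cost only (dynamics, horizon, noise level, and lambda are assumptions):

    import numpy as np

    rng = np.random.default_rng(10)

    # 1-D controlled diffusion dx = u dt + dxi, end cost phi(x_T) = (x_T - 1)^2.
    dt, T, nu, lam = 0.02, 1.0, 1.0, 1.0
    n_steps = int(T / dt)

    def u_star(x0, n_rollouts=5000):
        # Weight sampled noise realizations by exp(-S/lam); the weighted mean
        # of the first noise increment estimates the optimal control at (x0, 0).
        x = np.full(n_rollouts, x0)
        eps0 = None
        for k in range(n_steps):
            eps = np.sqrt(nu * dt) * rng.normal(size=n_rollouts)
            if k == 0:
                eps0 = eps
            x = x + eps                      # uncontrolled rollouts
        S = (x - 1.0)**2                     # path cost (terminal cost only here)
        w = np.exp(-(S - S.min()) / lam)     # stabilized importance weights
        return (w @ eps0) / (w.sum() * dt)

    print("u*(x=0, t=0) approx", u_star(0.0))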
15 2 Overview of research at SNN Machine learning
16 3 Presentation computer exercises
  • Ising model
  • Boltzmann machines
  • MLP
  • sparse regression
  • control theory


Examination:
There will be no final examination. The grade will be based on take-home computer exercises.

During one of the last lectures you will present your solution to one of these exercises. Hand in code that can be run stand-alone. In addition, write a report for each exercise; see the handleiding verslag (report guidelines, in Dutch). Everything should be handed in before the end of January 2018.