Together with the Department of Internal Medicine of the University
Hospital in Utrecht, we have been engaged since 1996 in the design of
a medical diagnostic system for internal medicine. The approach is
based on Bayesian networks. See here for a more detailed description.
In this paper we present a graphical model for polyphonic music transcription. Our model, formulated as a Dynamical Bayesian Network, embodies a transparent and computationally tractable approach to this acoustic analysis problem. An advantage of our approach is that it places emphasis on explicitly modelling the sound generation procedure. It provides a clear framework in which high-level (cognitive) prior information on music structure can be coupled with low-level (acoustic, physical) information in a principled manner to perform the analysis. The model is a special case of the generally intractable Switching Kalman Filter. Where possible, we derive exact polynomial-time inference procedures, and otherwise efficient approximations. We argue that our generative-model-based approach is computationally feasible for many music applications and is readily extensible to more general auditory scene analysis scenarios.
We present a probabilistic generative model for timing deviations in expressive music performance. The structure of the proposed model is equivalent to a switching state space model. The switch variables correspond to discrete note locations as in a musical score. The continuous hidden variables denote the tempo. We formulate two well-known music recognition problems, namely tempo tracking and automatic transcription (rhythm quantization), as filtering and maximum a posteriori (MAP) state estimation tasks. Exact computation of posterior features such as the MAP state is intractable in this model class, so we introduce Monte Carlo methods for integration and optimization. We compare Markov Chain Monte Carlo (MCMC) methods (such as Gibbs sampling, simulated annealing and iterative improvement) and sequential Monte Carlo methods (particle filters). Our simulations suggest that the sequential methods perform better. The methods can be applied in both online and batch scenarios, such as tempo tracking and transcription, and are thus potentially useful in a number of music applications such as adaptive automatic accompaniment, score typesetting and music information retrieval.
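To illustrate the sequential Monte Carlo idea in the abstract above, here is a minimal sketch of a bootstrap particle filter that tracks a slowly drifting beat period from inter-onset intervals. It is not the model from the paper: the score positions (the switch variables) are assumed known, the tempo follows a plain random walk, and the function name `particle_filter_tempo` and the noise levels `q` and `r` are hypothetical choices made for this example.

```python
import math
import random


def particle_filter_tempo(iois, beats, n_particles=500, q=0.01, r=0.02, seed=0):
    """Bootstrap particle filter for the beat period (seconds per beat).

    iois  : observed inter-onset intervals in seconds
    beats : notated length of each interval in beats (assumed known here)
    q, r  : std. dev. of tempo drift and of timing noise (hypothetical values)
    """
    rng = random.Random(seed)
    # Initialise particles from a broad uniform prior over the beat period.
    particles = [rng.uniform(0.3, 1.0) for _ in range(n_particles)]
    estimates = []
    for y, c in zip(iois, beats):
        # Propagate: random-walk drift of the period, kept strictly positive.
        particles = [max(1e-3, p + rng.gauss(0.0, q)) for p in particles]
        # Weight: Gaussian likelihood of the observed interval given the period.
        weights = [math.exp(-0.5 * ((y - c * p) / r) ** 2) for p in particles]
        total = sum(weights)
        if total == 0.0:
            # All weights underflowed; fall back to uniform weights.
            weights = [1.0] * n_particles
            total = float(n_particles)
        # Filtered estimate: weighted mean of the particles.
        estimates.append(sum(w * p for w, p in zip(weights, particles)) / total)
        # Resample in proportion to the weights (multinomial resampling).
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates
```

Calling `particle_filter_tempo([0.5, 0.25, 0.25, 0.5], [1, 0.5, 0.5, 1])` returns one filtered period estimate per observed interval. In the full model the note locations are themselves unknown, which is what makes exact inference intractable and motivates the Monte Carlo methods.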
Shortened version of the doctoral thesis of Tom Heskes. Overview of on-line learning in small networks with emphasis on handling local minima.
We derive cooling schedules for the global optimization of learning in neural networks. First, we discuss a two-level system with one global and one local minimum. The analysis is then extended to systems with various minima. A typical cooling schedule is of the form $\eta(t) = \eta^* / \ln t$, with $\eta(t)$ the learning parameter at time $t$ and $\eta^*$ a constant. In some simple cases $\eta^*$ can be calculated. Simulations confirm the theoretical results.
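As a rough illustration of what cooling the learning parameter means in practice, the sketch below runs noisy on-line gradient descent on a toy double-well error function while the learning parameter decays logarithmically. Everything here is an assumption made for the example: the logarithmic form $\eta(t) = \eta^* / \ln(t + t_0)$, the shift `t0` that keeps the logarithm positive, the toy error $E(w) = w^4/4 - w^2/2 + 0.2\,w$, the clipping of `w`, and all constants.

```python
import math
import random


def eta_log(t, eta_star=0.2, t0=10.0):
    """Logarithmic cooling schedule eta(t) = eta_star / ln(t + t0).

    The shift t0 is a hypothetical choice that avoids dividing by ln(1) = 0.
    """
    return eta_star / math.log(t + t0)


def grad(w):
    """Derivative of the toy error E(w) = w**4/4 - w**2/2 + 0.2*w.

    E has a local minimum near w = 0.88 and a global minimum near w = -1.09.
    """
    return w ** 3 - w + 0.2


def anneal(steps=20000, noise=1.0, seed=0):
    """Noisy on-line descent with a decaying learning parameter."""
    rng = random.Random(seed)
    w = 0.88  # start near the local minimum
    for t in range(steps):
        eta = eta_log(t)
        # On-line step: gradient plus sampling noise, both scaled by eta.
        w -= eta * (grad(w) + rng.gauss(0.0, noise))
        # Keep the toy system inside a bounded interval.
        w = max(-2.0, min(2.0, w))
    return w
```

Early on, the larger learning parameter lets the noise carry `w` across the barrier between the two minima; as `eta` shrinks, such transitions become rare and `w` settles into whichever well it occupies.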
An attempt is made to study learning in neural networks with local minima. For small learning parameter $\eta$, the transition time from one minimum to another is asymptotically given by $\tau \propto \exp(\tilde{\eta}/\eta)$, with $\tilde{\eta}$ a constant. The algorithm follows directly from a consideration of the statistics of the weights in the network. The characteristic behavior of the algorithm is calculated, both in a fixed and in a changing environment. A simple example, Widrow-Hoff learning for statistical classification, serves as an illustration.
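The Widrow-Hoff example mentioned above can be sketched as follows. This is only the basic on-line Widrow-Hoff (least mean squares) rule with a fixed learning parameter `eta`; the algorithm discussed in the abstract is not reproduced here. The function names, the bias term, and the default values are assumptions made for illustration.

```python
import random


def widrow_hoff(samples, eta=0.05, epochs=20, seed=0):
    """On-line Widrow-Hoff (LMS) learning of a linear classifier.

    samples : list of (x, y) pairs, x a tuple of floats, y in {-1.0, +1.0}
    eta     : fixed learning parameter (hypothetical default)
    """
    rng = random.Random(seed)
    dim = len(samples[0][0])
    w = [0.0] * dim
    b = 0.0
    data = list(samples)
    for _ in range(epochs):
        rng.shuffle(data)  # present the examples in random order
        for x, y in data:
            # Error of the linear output on this single example.
            err = y - (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # Move the weights along the error-weighted input.
            w = [wi + eta * err * xi for wi, xi in zip(w, x)]
            b += eta * err
    return w, b


def predict(w, b, x):
    """Classify x by the sign of the linear output."""
    out = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 if out >= 0.0 else -1.0
```

Because each update uses a single example, the weights keep fluctuating around the least-squares solution with a spread that grows with `eta`, which is why the choice of learning parameter matters.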
One of the first papers on on-line learning. For that reason often cited. Views on-line learning as a stochastic process. Also discusses learning in a changing environment.
Derives an energy function for a variant of the Kohonen learning rule. Discusses how to apply this learning rule to nonparametric regression.
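For orientation, here is a minimal sketch of the standard Kohonen learning rule on a one-dimensional map with scalar inputs. The variant studied in the paper is not specified in this note, so the code shows only the textbook rule; the function name, the Gaussian neighbourhood, and all parameter values are assumptions.

```python
import math
import random


def train_som_1d(data, n_units=5, eta=0.1, sigma=1.0, epochs=50, seed=0):
    """Standard Kohonen rule for a 1-D chain of units with scalar weights."""
    rng = random.Random(seed)
    weights = [rng.random() for _ in range(n_units)]
    for _ in range(epochs):
        for x in data:
            # Winner: the unit whose weight is closest to the input.
            winner = min(range(n_units), key=lambda i: (x - weights[i]) ** 2)
            for j in range(n_units):
                # Gaussian neighbourhood measured along the chain of units.
                h = math.exp(-((j - winner) ** 2) / (2.0 * sigma ** 2))
                # Pull each unit towards the input, scaled by its neighbourhood.
                weights[j] += eta * h * (x - weights[j])
    return weights
```

Each unit's weight can then be read as a local prototype of the input distribution, with neighbouring units on the chain tending towards neighbouring regions of the input space.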