This page is now out-of-date and you should have been redirected. If for some reason you can read this, please point your browser at my new page at www.miketipping.com. Thanks!

Sparse Bayesian Learning and the Relevance Vector Machine

Introduction

Welcome! This is Mike Tipping's page devoted to "Sparse Bayesian Learning", which is the not-particularly-catchy phrase to describe the application of Bayesian automatic relevance determination (ARD) concepts to models linear in their parameters. The motivation behind the approach is that one can infer a regression or classification model which is accurate and at the same time makes its predictions using only a small number of relevant basis functions which are automatically selected from a potentially large initial set.

The "relevance vector machine" is a special case of this idea, applied to linear kernel models, and may be of interest due to similarity of form with the popular "support vector machine".
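As a rough sketch of the idea in symbols (the notation below is introduced here for illustration and is not defined elsewhere on this page; it follows the usual convention of targets t_n, inputs x_n, weights w_i, one hyperparameter alpha_i per weight, and Gaussian noise of variance sigma^2):

```latex
% Linear-in-the-parameters model with M basis functions and Gaussian noise
t_n = y(\mathbf{x}_n; \mathbf{w}) + \epsilon_n, \qquad
y(\mathbf{x}; \mathbf{w}) = \sum_{i=1}^{M} w_i \, \phi_i(\mathbf{x}), \qquad
\epsilon_n \sim \mathcal{N}(0, \sigma^2)

% ARD prior: an independent zero-mean Gaussian for each weight,
% with its own precision hyperparameter \alpha_i
p(\mathbf{w} \mid \boldsymbol{\alpha}) = \prod_{i=1}^{M} \mathcal{N}\!\left(w_i \mid 0, \alpha_i^{-1}\right)

% Relevance vector machine: one kernel basis function per training example
\phi_i(\mathbf{x}) = K(\mathbf{x}, \mathbf{x}_i)
```

Maximising the marginal likelihood with respect to the alpha_i typically drives many of them to infinity, which pins the corresponding weights at zero and removes those basis functions from the model. The training examples whose kernel functions survive are the "relevance vectors".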

This page is continually in the process of being improved ... and to prove it, I last updated it on July 5, 2005.

Current Highlight:

An introductory paper on Bayesian inference is available:

• Bayesian inference: An introduction to principles and practice in machine learning. In O. Bousquet, U. von Luxburg, and G. Rätsch (Eds.), Advanced Lectures on Machine Learning, pp. 41–62. Springer. [Abstract] [gzipped PostScript]

Papers

A comprehensive paper on sparse Bayesian learning from the Journal of Machine Learning Research:

• Sparse Bayesian learning and the relevance vector machine. Journal of Machine Learning Research 1, 211–244. [Abstract] [Available from JMLR]

There are a couple of (very minor) typos in the above. [corrections here]

Two early conference publications on the Relevance Vector Machine:

• The Relevance Vector Machine. In S. A. Solla, T. K. Leen, and K.-R. Müller (Eds.), Advances in Neural Information Processing Systems 12, pp. 652–658. Cambridge, Mass: MIT Press. [Abstract] [gzipped PostScript]
• Variational relevance vector machines. In C. Boutilier and M. Goldszmidt (Eds.), Proceedings of the 16th Conference on Uncertainty in Artificial Intelligence, pp. 46–53. Morgan Kaufmann. [Abstract] [PDF] [gzipped PostScript]

Note that the "variational" relevance vector machine is pretty much identical to the original, but is a lot slower to train :-)

Exploiting the sparse Bayes methodology to realise "sparse kernel PCA":

• Sparse kernel principal component analysis. In Advances in Neural Information Processing Systems 13. MIT Press. [Abstract] [gzipped PostScript]

Robust sparse Bayesian regression:

• A variational approach to robust regression. In G. Dorffner, H. Bischof, and K. Hornik (Eds.), Proceedings of ICANN'01, pp. 95–102. Springer. [Abstract] [gzipped PostScript]

Some theoretical analysis of marginal likelihood optimisation and sparsity:

• Analysis of sparse Bayesian learning. In T. G. Dietterich, S. Becker, and Z. Ghahramani (Eds.), Advances in Neural Information Processing Systems 14, pp. 383–389. MIT Press. [Abstract] [gzipped PostScript]

An accelerated learning algorithm:

• Fast marginal likelihood maximisation for sparse Bayesian models. In C. M. Bishop and B. J. Frey (Eds.), Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics, Key West, FL, Jan 3–6. [Abstract] [gzipped PostScript]

This is the algorithm of choice for implementing a sparse Bayes model. It's an order of magnitude faster than the original, uses less memory and analytically (rather than numerically) "prunes" irrelevant basis functions.
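To illustrate what "analytic" pruning means, here is a minimal Python sketch of the per-basis-function decision. It assumes the "sparsity" and "quality" factors (called s and q below) have already been computed for the basis function in question; computing them from data is not shown. The function names are hypothetical and are not taken from the SparseBayes code. The marginal likelihood, viewed as a function of a single hyperparameter alpha, has its maximum at a finite value when q^2 > s and at infinity otherwise, so each basis function can be kept or discarded in closed form.

```python
import math


def optimal_alpha(s, q):
    """Return the alpha that maximises the marginal likelihood for one
    basis function, given its sparsity factor s and quality factor q.

    A return value of math.inf means the basis function is irrelevant.
    """
    if q * q > s:
        return s * s / (q * q - s)
    return math.inf


def decide_action(s, q, in_model):
    """Choose what to do with one basis function (hypothetical helper).

    in_model says whether the basis function is currently in the model.
    Returns a pair (action, alpha).
    """
    alpha = optimal_alpha(s, q)
    if math.isinf(alpha):
        # Irrelevant: prune it if present, otherwise leave it out.
        return ("delete" if in_model else "skip"), alpha
    # Relevant: update alpha if present, otherwise bring it into the model.
    return ("re-estimate" if in_model else "add"), alpha


if __name__ == "__main__":
    # Illustrative values only, not taken from any real data set.
    print(decide_action(2.0, 3.0, in_model=False))
    print(decide_action(2.0, 1.0, in_model=True))
```

Because the decision is a closed-form test on s and q, no threshold on a very large but finite alpha is needed to decide when a basis function should go.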

Slides

Copies of the slides from my 2003 lectures at the Tübingen "Machine Learning Summer School" are now available in ".ps.gz" format:

• Introduction to Bayesian Inference [180 KB]
• Bayesian Inference: Marginalisation [147 KB]
• Sparse Bayesian Models and the "Relevance Vector Machine" [1.18 MB]
• Sparse Bayesian Models: Analysis, Optimisation and Applications [2.68 MB]

Code: a Matlab implementation of "SparseBayes V1.0" is available

Some simple Matlab code to implement sparse Bayesian regression and classification models (such as the RVM) is now available: [gzipped tar file]

This code implements the "original" algorithm, and so is relatively slow.

I expect to release "SparseBayes V2.0", some much-improved new code to implement the "fast" algorithm, by the end of October 2005.