Bayesian Methods for Adaptive Models

Abstract

The Bayesian framework for model comparison and regularisation is demonstrated by studying interpolation and classification problems modelled with both linear and non-linear models. This framework quantitatively embodies ‘Occam’s razor’. Over-complex and under-regularised models are automatically inferred to be less probable, even though their flexibility allows them to fit the data better. When applied to ’neural networks’, the Bayesian framework makes possible (1) objective comparison of solutions using alternative network architectures; (2) objective stopping rules for network pruning or growing procedures; (3) objective choice of type of weight decay terms (or regularisers); (4) on-line techniques for optimising weight decay (or regularisation constant) magnitude; (5) a measure of the effective number of well-determined parameters in a model; (6) quantified estimates of the error bars on network parameters and on network output. In the case of classification models, it is shown that the careful incorporation of error bar information into a classifier’s predictions yields improved performance. Comparisons of the inferences of the Bayesian Framework with more traditional cross-validation methods help detect poor underlying assumptions in learning models. The relationship of the Bayesian learning framework to ‘active learning’ is examined. Objective functions are discussed which measure the expected informativeness data measurements, in the context of both interpolation and classification problems. The concepts and methods described in this thesis are quite general and will be applicable to other data modelling problems whether they involve regression, classification or density estimation.

Type