Title: A Bayesian expected error reduction approach to Active Learning
Author: Fredlund, Richard
Awarding Body: University of Exeter
Current Institution: University of Exeter
Date of Award: 2011
There has been growing interest in active learning for binary classification. This thesis develops a Bayesian approach to active learning that aims to minimise the objective function on which the learner is evaluated, namely the expected misclassification cost. We call this the expected cost reduction approach to active learning. In this form of active learning, queries are selected by performing a 'lookahead' to evaluate the associated expected misclassification cost.

Firstly, we introduce the concept of a query density to model explicitly how new data is sampled. An expected cost reduction framework for active learning is then developed which allows the learner to sample data according to arbitrary query densities. The model makes no assumption of independence between queries, instead updating model parameters on the basis of both which observations were made and how they were sampled. This approach is demonstrated on the probabilistic high-low game, which is a non-separable extension of the high-low game presented by Seung et al. (1993). The results indicate that the Bayesian expected cost reduction approach performs significantly better than passive learning even when there is considerable overlap between the class distributions, covering 30% of input space. For the probabilistic high-low game, however, narrow queries appear to consistently outperform wide queries. We therefore conclude the first part of the thesis by investigating whether this is always the case, demonstrating examples where sampling broadly is preferable to a single input query.

Secondly, we explore the Bayesian expected cost reduction approach to active learning within the pool-based setting. This is where learning is limited to a finite pool of unlabelled observations from which the learner may select observations to be queried for class labels. Our implementation of this approach uses Gaussian process classification with the expectation propagation approximation to make the necessary inferences. The implementation is demonstrated on six benchmark data sets and again shows superior performance to passive learning.
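
The lookahead query selection described in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: it replaces Gaussian process classification and expectation propagation with an exact grid posterior over a single threshold, and it assumes a simple label-flip noise model (rate `eps`) as a stand-in for the probabilistic high-low game. All function and variable names (`label_likelihood`, `expected_cost`, `select_query`, `thetas`, `eps`) are hypothetical, and equal misclassification costs are assumed.

```python
import numpy as np

def label_likelihood(x, y, thetas, eps):
    """P(y | x, theta) for every threshold on the grid.

    Assumed noise model: the label is 1 when x > theta, flipped with probability eps.
    """
    p1 = np.where(x > thetas, 1.0 - eps, eps)
    return p1 if y == 1 else 1.0 - p1

def expected_cost(posterior, thetas, eval_points, eps):
    """Expected 0/1 misclassification cost of the Bayes decision, averaged over eval_points."""
    # p1[i] = P(y = 1 | eval_points[i], data), marginalised over the threshold posterior
    p1 = np.array([np.dot(posterior, label_likelihood(x, 1, thetas, eps))
                   for x in eval_points])
    return float(np.mean(np.minimum(p1, 1.0 - p1)))

def select_query(posterior, thetas, candidates, eval_points, eps):
    """Pick the candidate whose lookahead expected cost is lowest."""
    best_x, best_cost = None, np.inf
    for x in candidates:
        cost = 0.0
        for y in (0, 1):
            lik = label_likelihood(x, y, thetas, eps)
            p_y = float(np.dot(posterior, lik))   # predictive probability of label y at x
            if p_y == 0.0:
                continue                          # this label cannot occur, contributes nothing
            updated = posterior * lik / p_y       # posterior after hypothetically observing (x, y)
            cost += p_y * expected_cost(updated, thetas, eval_points, eps)
        if cost < best_cost:
            best_x, best_cost = x, cost
    return best_x, best_cost

# Illustrative settings only; none of these values come from the thesis.
thetas = np.linspace(0.0, 1.0, 101)                    # grid of candidate thresholds
prior = np.full(thetas.size, 1.0 / thetas.size)        # uniform prior over the threshold
candidates = np.linspace(0.05, 0.95, 19)               # pool of possible query inputs
eval_points = np.linspace(0.0, 1.0, 51)                # inputs on which cost is evaluated

query, cost = select_query(prior, thetas, candidates, eval_points, eps=0.1)
print("selected query:", query, "lookahead expected cost:", cost)
```

For each candidate input the sketch averages, over both possible labels weighted by their predictive probabilities, the expected cost that would remain after updating the posterior. It then queries the candidate with the lowest value. The sketch selects single-point queries only and does not model query densities.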
Supervisor: Fieldsend, Jonathan ; Everson, Richard
Sponsor: EPSRC
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available
Keywords: Active Learning ; Bayes ; Expected Error Reduction ; machine learning ; binary classification