Title: Diversity, margins and non-stationary learning
Author: Stapenhurst, Richard John
ISNI: 0000 0004 2736 797X
Awarding Body: University of Manchester
Current Institution: University of Manchester
Date of Award: 2012
Ensemble methods are frequently applied to classification problems, and generally improve upon the performance of individual models. Diversity is considered to be an important factor in this performance improvement; in the literature there is strong support for the idea that high diversity is crucial in ensembles. Voting margins provide an alternative explanation of the behaviour of ensembles; they have been prominently used in the interpretation of the AdaBoost algorithm, and the literature suggests that large margins are beneficial. In this thesis, we examine these two quantities, both of which the literature suggests should be increased, and show that (in 2-class problems) they are inversely related, with high diversity corresponding to small absolute margins. From this it can be seen that the views expressed in the literature are contradictory; we argue that ensemble behaviour can be sufficiently understood without the need to quantify ‘diversity’. However, in non-stationary learning scenarios, where we must process data that is not independent and identically distributed, the model must not only generalise well, but also adapt to changes in the distribution. Building on the work of Minku, we hypothesise that high diversity might be of special significance in such problems in determining the rate at which the model can adapt. We use the correspondence between diversity and margins to formulate the reasoning behind this intuition formally, and then derive an algorithm that explicitly manages diversity in order to test this hypothesis. An empirical investigation shows that managing diversity can, under certain conditions, improve the ability of an ensemble to adapt to a new concept; however, it typically seems that other aspects of the learning algorithm, especially concept change detection, have a substantially larger impact on performance than diversity does.
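The inverse relation between diversity and absolute margin in 2-class problems can be illustrated with a minimal sketch. It assumes labels and votes in {-1, +1}, an unweighted majority-vote ensemble, and pairwise disagreement as the diversity measure; the record does not state which diversity measure the thesis uses, so that choice and all function names below are illustrative assumptions rather than the author's method.

```python
from itertools import combinations


def voting_margin(votes, label):
    """Normalised voting margin in [-1, 1]; votes and label are in {-1, +1}."""
    return sum(v * label for v in votes) / len(votes)


def pairwise_disagreement(votes):
    """Fraction of classifier pairs that predict different labels.

    Assumed diversity measure for this sketch only.
    """
    pairs = list(combinations(votes, 2))
    return sum(a != b for a, b in pairs) / len(pairs)


def disagreement_from_margin(margin, n):
    """Closed form for n unweighted voters: n / (n - 1) * (1 - margin**2) / 2."""
    return n / (n - 1) * (1 - margin ** 2) / 2


# Five voters on one example with true label +1, three of them correct.
votes = [1, 1, 1, -1, -1]
m = voting_margin(votes, 1)
d = pairwise_disagreement(votes)
```

Under these assumptions the disagreement depends on the votes only through the squared margin, so it is largest when the absolute margin is near zero and falls to zero when the ensemble is unanimous.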
Supervisor: Brown, Gavin
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available