Use this URL to cite or link to this record in EThOS:
Title: Feature extraction in classification
Author: Liu, Raymond
ISNI:       0000 0005 0731 8190
Awarding Body: Imperial College London
Current Institution: Imperial College London
Date of Award: 2014
Availability of Full Text:
Access from EThOS:
Access from Institution:
Feature extraction, or dimensionality reduction, is an essential part of many machine learning applications. The necessity for feature extraction stems from the curse of dimensionality and the high computational cost of manipulating high-dimensional data. In this thesis we focus on feature extraction for classification. There are several approaches, and we will focus on two such: the increasingly popular information-theoretic approach, and the classical distance-based, or variance-based approach. Current algorithms for information-theoretic feature extraction are usually iterative. In contrast, PCA and LDA are popular examples of feature extraction techniques that can be solved by eigendecomposition, and do not require an iterative procedure. We study the behaviour of an example of iterative algorithm that maximises Kapur's quadratic mutual information by gradient ascent, and propose a new estimate of mutual information that can be maximised by closed-form eigendecomposition. This new technique is more computationally efficient than iterative algorithms, and its behaviour is more reliable and predictable than gradient ascent. Using a general framework of eigendecomposition-based feature extraction, we show a connection between information-theoretic and distance-based feature extraction. Using the distance-based approach, we study the effects of high input dimensionality and over-fitting on feature extraction, and propose a family of eigendecomposition-based algorithms that can solve this problem. We investigate the relationship between class-discrimination and over-fitting, and show why the advantages of information-theoretic feature extraction become less relevant in high-dimensional spaces.
Supervisor: Gillies, Duncan Sponsor: Engineering and Physical Sciences Research Council ; Imperial College London
Qualification Name: Thesis (Ph.D.) Qualification Level: Doctoral