Use this URL to cite or link to this record in EThOS:
Title: Data-independent vs. data-dependent dimension reduction for pattern recognition in high dimensional spaces
Author: Hassan, Tahir Mohammed
ISNI:       0000 0004 6425 5045
Awarding Body: University of Buckingham
Current Institution: University of Buckingham
Date of Award: 2017
Availability of Full Text:
Access from EThOS:
Access from Institution:
There has been a rapid emergence of new pattern recognition/classification techniques in a variety of real world applications over the last few decades. In most of the pattern recognition/classification applications, the pattern of interest is modelled by a data vector/array of very high dimension. The main challenges in such applications are related to the efficiency of retrieval, analysis, and verifying/classifying the pattern/object of interest. The “Curse of Dimension” is a reference to these challenges and is commonly addressed by Dimension Reduction (DR) techniques. Several DR techniques has been developed and implemented in a variety of applications. The most common DR schemes are dependent on a dataset of “typical samples” (e.g. the Principal Component Analysis (PCA), and Linear Discriminant Analysis (LDA)). However, data-independent DR schemes (e.g. Discrete Wavelet Transform (DWT), and Random Projections (RP)) are becoming more desirable due to lack of density ratio of samples to dimension. In this thesis, we critically review both types of techniques, and highlight advantages and disadvantages in terms of efficiency and impact on recognition accuracy. We shall study the theoretical justification for the existence of DR transforms that preserve, within tolerable error, distances between would be feature vectors modelling objects of interest. We observe that data-dependent DRs do not specifically attempts to preserve distances, and the problems of overfitting and biasness are consequences of low density ratio of samples to dimension. Accordingly, the focus of our investigations is more on data-independent DR schemes and in particular on the different ways of generating RPs as an efficient DR tool. RPs suitable for pattern recognition applications are only restricted by a lower bound on the reduced dimension that depends on the tolerable error. Besides, the known RPs that are generated in accordance to some probability distributions, we investigate and test the performance of differently constructed over-complete Hadamard mxn (m<
Supervisor: Not available Sponsor: Not available
Qualification Name: Thesis (Ph.D.) Qualification Level: Doctoral
EThOS ID:  DOI: Not available
Keywords: QA75 Electronic computers. Computer science