Title: Feature selection in computational biology
Author: Athanasakis, D.
ISNI: 0000 0004 5358 0512
Awarding Body: University College London (University of London)
Current Institution: University College London (University of London)
Date of Award: 2014
Availability of Full Text: Full text unavailable from EThOS.
This thesis concerns feature selection, with a particular emphasis on the computational biology domain and the possibility of non-linear interaction between features. Towards this it establishes a two-step approach, where the first step is feature selection, followed by the learning of a kernel machine in this reduced representation. Optimization of kernel target alignment is proposed as a model selection criterion and its properties are established for a number of feature selection algorithms, including some novel variants of stability selection. The thesis further studies greedy and stochastic approaches for optimizing alignment, proposing a fast stochastic method with substantial probabilistic guarantees. The proposed stochastic method compares favorably to its deterministic counterparts in terms of computational complexity and resulting accuracy. The characteristics of this stochastic proposal in terms of computational complexity and applicability to multi-class problems make it invaluable to a deep learning architecture which we propose. Very encouraging results of this architecture in a recent challenge dataset further justify this approach, with good further results on a signal peptide cleavage prediction task. These proposals are evaluated in terms of generalization accuracy, interpretability and numerical stability of the models, and speed on a number of real datasets arising from infectious disease bioinformatics, with encouraging results.
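For readers unfamiliar with the criterion named in the abstract, the following is a minimal sketch of kernel target alignment used to score feature subsets in a greedy forward search. It is not the thesis's algorithm: it assumes binary labels in {-1, +1}, an RBF kernel, and the standard alignment definition, the Frobenius inner product of the kernel matrix K with yy^T divided by the product of their Frobenius norms. The function names, the `gamma` parameter, and the toy data are all hypothetical and chosen only for illustration.

```python
import math


def rbf_kernel(X, features, gamma=1.0):
    """Gram matrix of an RBF kernel computed on the given feature indices only."""
    n = len(X)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((X[i][f] - X[j][f]) ** 2 for f in features)
            K[i][j] = math.exp(-gamma * sq_dist)
    return K


def alignment(K, y):
    """Kernel target alignment <K, yy^T>_F / (||K||_F * ||yy^T||_F)."""
    n = len(y)
    inner = sum(K[i][j] * y[i] * y[j] for i in range(n) for j in range(n))
    k_norm = math.sqrt(sum(K[i][j] ** 2 for i in range(n) for j in range(n)))
    yy_norm = sum(v * v for v in y)  # ||yy^T||_F equals the squared norm of y
    return inner / (k_norm * yy_norm)


def greedy_select(X, y, k, gamma=1.0):
    """Greedy forward selection (illustrative assumption, not the thesis's method):
    at each step add the feature whose inclusion gives the highest alignment."""
    selected = []
    remaining = list(range(len(X[0])))
    while remaining and len(selected) < k:
        best = max(
            remaining,
            key=lambda f: alignment(rbf_kernel(X, selected + [f], gamma), y),
        )
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: feature 0 separates the classes, feature 1 does not.
X_toy = [[0.0, 5.0], [0.0, 1.0], [1.0, 4.0], [1.0, 2.0]]
y_toy = [-1, -1, 1, 1]
chosen = greedy_select(X_toy, y_toy, k=1)
```

In the two-step approach the abstract describes, a kernel machine would then be trained on the selected features only. The greedy search here recomputes the full Gram matrix for every candidate, which is the kind of cost the abstract's stochastic method is said to reduce.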
Supervisor: Not available
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available