Title: Integration and biological interpretation of microarray gene expression profiling data
Author: Menon, Suraj
ISNI:       0000 0004 2749 2068
Awarding Body: Cardiff University
Current Institution: Cardiff University
Date of Award: 2009
Many different strategies have been developed for the analysis of microarray data, and these strongly influence the amount and quality of knowledge that can be gained from a microarray-based experiment. Two such strategies are explored in this thesis. Part A describes explorations of a resource-efficient strategy that could allow for large-scale integration of microarray data in an unsupervised fashion. For this purpose, comparisons were carried out between a series of genelists manually extracted from the literature, representing a disparate set of microarray experiments. Initial results were highly unexpected, and are likely to have been caused by violations of the assumptions of the hypergeometric test used for assessing comparisons. Statistical modelling successfully simulated these results; however, the estimated net effect of these violations was found to be considerable. These findings strongly caution against the comparison of microarray experiments using their genelists. Part B then describes the development of Gene Set Discovery (GSD), a novel methodology to perform threshold-free gene set analysis of microarray datasets without requiring sample class information. This was achieved by deriving a novel metric that allows for the selection of those gene sets that exhibit significant discrimination between samples. GSD was applied to four microarray datasets, and the results were found to be biologically plausible and/or in agreement with prior analyses of these datasets. These findings suggest that GSD could be a useful tool for biological theme discovery in microarray datasets, particularly in studies of cancer where sample classification is problematic. Also described are a related methodology for extracting informative genes from within selected gene sets and a scheme for visualizing results in an integrated format.
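As an illustration of the kind of genelist comparison mentioned in Part A, the following is a minimal sketch of a hypergeometric overlap test. It is not taken from the thesis: the function name, its parameters, and the numbers in the usage lines are hypothetical. The test assumes both lists are drawn from one shared gene universe in which every gene is equally likely to be selected, so a small p-value is only meaningful when that assumption holds.

```python
from math import comb


def overlap_p_value(universe_size, list_a_size, list_b_size, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    X is the number of genes shared by two lists of the given sizes,
    if list B were drawn at random, without replacement, from a
    universe of `universe_size` genes that contains all of list A.
    All names and parameters here are illustrative assumptions.
    """
    max_overlap = min(list_a_size, list_b_size)
    # Number of ways to choose list B from the whole universe.
    total = comb(universe_size, list_b_size)
    # Count the draws that share k genes with list A, for k >= overlap.
    tail = sum(
        comb(list_a_size, k) * comb(universe_size - list_a_size, list_b_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical example: 20,000 genes on the array, lists of 200 and
# 300 genes, 10 genes in common (about 3 would be expected by chance).
p = overlap_p_value(20000, 200, 300, 10)
print(f"p = {p:.3g}")
```

Exact integer arithmetic via `math.comb` is used here to avoid a third-party dependency; a statistics library would normally be preferred for large universes.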
Supervisor: Not available
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available