Use this URL to cite or link to this record in EThOS: http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.606130
Title: Quantitative analysis of time-series microarray data, with application to investigating responses to environmental stresses in Arabidopsis
Author: Law, Philip John
Awarding Body: University of Warwick
Current Institution: University of Warwick
Date of Award: 2013
Abstract:
High-throughput technologies have made it possible to perform genome-scale analyses to investigate a variety of research areas. These analyses generate vast amounts of potentially noisy data, and the noise can obscure the underlying signal. In this thesis, a high-throughput regression analysis approach was developed, in which a variety of linear and nonlinear models were fitted to gene expression profiles from time course experiments. These models included the logistic, Gompertz, exponential, critical exponential, linear+exponential, Gaussian, and hyperbolic functions. The fitted parameters from these models reflect aspects of the model shape, and are thus biologically interpretable. Investigating the fitted parameters allowed the gene expression profiles to be interpreted in terms of the underlying biology, such as the time of initial expression. This provides a potentially more mechanistic approach to studying the genetic responses to stimuli. The analysis was applied to three time series gene expression experiments: a Saccharomyces cerevisiae time course as a validation of the method, and two time course experiments on Arabidopsis thaliana investigating stress responses to the senescence process and to pathogen infection by Botrytis cinerea. A cluster analysis, named ShapeCluster, was developed as an application of the fitted models. Using this analysis, it was possible to cluster on aspects of the shape of the expression profiles using different combinations of parameters. This added flexibility to the analysis and allowed the data to be investigated in multiple ways. Specifically, performing the cluster analysis on a specific parameter permitted the identification of genes that are co-regulated, or that participate in the response to the biological stress in question.
Several methods of producing clusters with combinations of parameters, namely simultaneous parameter clustering, sequential meta-clustering, and cross meta-clustering, provided additional means of interrogating the data. Clusters from these methods were assessed for significance through the use of over-represented annotation terms and motifs, and were found to produce biologically relevant sets of genes. Experiments using quantitative-PCR and luciferase transcriptional reporters were designed to determine the response to a combined Botrytis and senescence stress. A predicted model was identified by fitting a factor model to the experimental data and identifying the most significant model effects. This model removed noise from the biological data, and confirmed that the effects of the two stresses were additive. In cross-sectional data, each sample is obtained from a separate individual (plant), and the samples may thus be of different biological ages. An iterative, cross-validation multivariate regression approach, termed time shifting, was developed to estimate the true biological age of the replicate samples, and it was shown that the approach resulted in better model fits for a large proportion of the genes. In this thesis, a number of novel analytical approaches for obtaining information from gene expression microarray datasets were developed. These analyses provided biologically oriented descriptions of individual gene expression profiles, allowing for the modelling and greater interpretation of profiles obtained from time-series experiments. Through careful choice of appropriate models, such statistical regression approaches allow for an improved comparison of gene expression profiles, and may provide an improved understanding of common regulatory mechanisms between genes.
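The regression idea summarised in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the four-parameter logistic form, the function names (`sigmoid`, `fit_logistic`), the grid ranges, and the synthetic profile are all assumptions made for illustration. The sketch fits a logistic curve to a single expression time course by a simple grid search and returns parameters that can be read directly: the baseline level, the plateau level, the rate of change, and the time of the half-maximal response.

```python
import math


def sigmoid(t, rate, midpoint):
    """Standard logistic curve rising from 0 to 1, centred on `midpoint`."""
    return 1.0 / (1.0 + math.exp(-rate * (t - midpoint)))


def fit_logistic(times, values, n_rates=30, n_midpoints=41):
    """Least-squares fit of y = lower + span * sigmoid(t, rate, midpoint).

    Hypothetical helper: rate and midpoint are searched on a coarse grid,
    and for each pair the two linear parameters (lower, span) are obtained
    in closed form by ordinary linear regression of y on the sigmoid values.
    """
    t_min, t_max = min(times), max(times)
    n = len(times)
    mean_y = sum(values) / n
    best = None
    for i in range(1, n_rates + 1):
        rate = 0.1 * i  # assumed search range: 0.1 to 3.0 per time unit
        for j in range(n_midpoints):
            midpoint = t_min + (t_max - t_min) * j / (n_midpoints - 1)
            s = [sigmoid(t, rate, midpoint) for t in times]
            mean_s = sum(s) / n
            var_s = sum((x - mean_s) ** 2 for x in s)
            if var_s < 1e-12:
                continue  # curve is flat over the sampled times; skip
            cov = sum((x - mean_s) * (y - mean_y) for x, y in zip(s, values))
            span = cov / var_s
            lower = mean_y - span * mean_s
            sse = sum((lower + span * x - y) ** 2 for x, y in zip(s, values))
            if best is None or sse < best["sse"]:
                best = {
                    "lower": lower,          # baseline expression level
                    "upper": lower + span,   # plateau expression level
                    "rate": rate,            # steepness of the transition
                    "midpoint": midpoint,    # time of half-maximal change
                    "sse": sse,              # residual sum of squares
                }
    return best


# Synthetic, noise-free profile (illustrative values, not thesis data).
times = list(range(0, 22, 2))
values = [2.0 + 6.0 * sigmoid(t, 1.5, 10.0) for t in times]
fit = fit_logistic(times, values)
print(fit)
```

In practice a dedicated nonlinear least-squares routine would normally replace the grid search; the grid is used here only to keep the sketch dependency-free. Parameters such as `midpoint` are the kind of shape descriptor on which a parameter-based clustering could then operate.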
Supervisor: Not available
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID: uk.bl.ethos.606130
DOI: Not available
Keywords: QK Botany ; QP Physiology