Title: Reconstructing regulatory networks from high-throughput post-genomic data using MCMC methods
Author: Sharma, Sapna
Awarding Body: University of Warwick
Current Institution: University of Warwick
Date of Award: 2013
Modern biological research aims to understand when genes are expressed and how certain genes influence the expression of other genes. Gene regulatory networks are used to organize and visualize gene expression activity. The architecture of these networks is of great importance, as the networks enable us to identify inconsistencies between hypotheses and observations, and to predict the behavior of biological processes in as yet untested conditions. Data from gene expression measurements are used to construct gene regulatory networks. Along with the advance of high-throughput technologies for measuring gene expression, statistical methods to predict regulatory networks have also been evolving. This thesis presents a computational framework based on a Bayesian modeling technique using state space models (SSM) for the inference of gene regulatory networks from time-series measurements. A linear SSM consists of observation and hidden state equations. The hidden variables can capture effects that cannot be directly measured in an experiment, such as missing gene expression. We have used a Bayesian MCMC approach based on Gibbs sampling for the inference of parameters. However, the task of determining the dimension of the hidden state space variables remains crucial for the accuracy of network inference. For this we have used the Bayesian evidence (or marginal likelihood) as a yardstick. In addition, the Bayesian approach also provides the possibility of incorporating prior information, based on literature knowledge. We compare marginal likelihoods calculated from the Gibbs sampler output to the lower bound calculated by a variational approximation. Before using the algorithm for the analysis of real biological experimental datasets, we perform validation tests using numerical experiments based on simulated time series datasets generated by in-silico networks.
The robustness of our algorithm can be measured by its ability to recapture the input data and the generating networks using the inferred parameters. Our algorithm, GBSSM, was used to infer a gene network from E. coli data sets obtained under two different stress conditions, temperature shift and acid stress. The resulting model for the gene expression response under temperature shift captures the effects of global transcription factors, such as fnr, that control the regulation of hundreds of other genes. Interestingly, we also observe the stress-inducible membrane protein OsmC regulating transcriptional activity involved in the adaptation mechanism under both temperature shift and acid stress conditions. In the case of acid stress, integration of metabolomic and transcriptome data suggests that the observed rapid decrease in the concentration of glycine betaine is the result of the activation of osmoregulators, which may play a key role in acid stress adaptation.
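As an illustration of the "observation and hidden state equations" mentioned in the abstract, the following is a minimal sketch of a generic linear-Gaussian state space model, x[t+1] = A x[t] + w[t] and y[t] = C x[t] + v[t], used to generate a simulated time series. It is not the thesis's GBSSM implementation; the exact model form used there (for example, any input or feedback terms) is not given in this record. The function name `simulate_linear_ssm`, the matrices `A` and `C`, and the noise levels are all hypothetical choices made for this example.

```python
import random


def simulate_linear_ssm(A, C, n_steps, state_noise=0.1, obs_noise=0.1, seed=0):
    """Simulate x[t+1] = A x[t] + w[t], y[t] = C x[t] + v[t] with Gaussian noise.

    A is a k x k list of lists (hidden state dynamics, assumed for illustration),
    C is a p x k list of lists (maps k hidden states to p observed genes).
    Returns (states, observations), each a list with n_steps entries.
    """
    rng = random.Random(seed)
    k = len(A)  # dimension of the hidden state
    x = [rng.gauss(0.0, 1.0) for _ in range(k)]  # random initial hidden state
    states, observations = [], []
    for _ in range(n_steps):
        # Observation equation: each measurement is a noisy linear readout of x.
        y = [
            sum(c * xj for c, xj in zip(row, x)) + rng.gauss(0.0, obs_noise)
            for row in C
        ]
        states.append(x)
        observations.append(y)
        # Hidden state equation: linear dynamics plus process noise.
        x = [
            sum(a * xj for a, xj in zip(row, x)) + rng.gauss(0.0, state_noise)
            for row in A
        ]
    return states, observations


# Hypothetical example: 2 hidden states driving 3 observed genes over 20 time points.
A = [[0.9, 0.1], [-0.2, 0.8]]
C = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
states, observations = simulate_linear_ssm(A, C, n_steps=20)
```

In an inference setting only `observations` would be available; the hidden states, the matrices, and the hidden state dimension `k` are the unknowns that a sampler such as the Gibbs scheme described in the abstract would have to estimate.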
Supervisor: Not available
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available
Keywords: QA Mathematics ; QH426 Genetics