Title: Bayesian nonparametric approaches to modelling dependencies in systems biology
Author: Zurauskiene, Justina
ISNI: 0000 0004 5922 6944
Awarding Body: Imperial College London
Current Institution: Imperial College London
Date of Award: 2014
Availability of Full Text: Full text unavailable from EThOS.
All living organisms exhibit complex behaviour, and this is a result of the underlying regulatory mechanisms that occur at cellular and molecular levels. For this reason such mechanisms are of central importance in the field of systems biology. Throughout this thesis we are concerned with mathematical models that allow us to better understand and represent the biological phenomena behind experimental data, and equally to make predictions about key regulatory processes happening in the cells. Specifically, this work explores and demonstrates how modern Bayesian nonparametric techniques, namely Gaussian process regression and Dirichlet process mixture models, can be applied in order to model complex systems biology data. Here we have developed a new technique based on Gaussian process regression approaches to model metabolic regulatory processes at the cellular level. Our technique allows us to model noisy metabolite time-course data and predicts dynamical metabolic flux behaviour in the associated pathways; we demonstrate that by learning the dependencies between several metabolites we can strengthen our predictions in sparsely sampled regions. We furthermore discuss when Gaussian processes can accurately reconstruct the underlying functions and when they are subject to the Nyquist limit. Next we proceed to modelling biological processes that occur at the molecular level. Here we are interested in studying large and diverse functional genomics datasets. A variety of computational techniques allow us to analyse such data and model the biological processes underlying them; an important class of these methods comprises techniques that permit the detection of heterogeneity in experimentally observed data. Here we employ Dirichlet processes to estimate the number of clusters within such genomic datasets and further propose a new method to tackle the data fusion problem.
Our technique primarily relies on the outcomes from nonparametric Bayesian clustering approaches and is based on graph theory concepts, but in parallel we also discuss and show how this graph-theoretical approach can be extended to integrate results from non-Bayesian clustering algorithms. We show that by integrating several data types we can successfully identify, for example, sets of genes that are regulated by similar transcription factors.
Supervisor: Stumpf, Michael
Sponsor: Leverhulme Trust
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available