Title: Unravelling biological processes using graph theoretical algorithms and probabilistic models
Author: Vangelov, Borislav
ISNI: 0000 0004 6061 4637
Awarding Body: Imperial College London
Current Institution: Imperial College London
Date of Award: 2015
This thesis develops computational methods that provide insights into the behaviour of biomolecular processes. The methods extract a simplified representation, or model, from samples characterising the profiles of different biomolecular functional units. The simplified representation helps us better understand the relations between the functional units or between the samples. The proposed computational methods integrate graph theoretical algorithms and probabilistic models. First, we were interested in finding proteins that have a similar role in the transcription cycle. We performed a clustering analysis on an experimental dataset using a graph partitioning algorithm and found groups of proteins associated with different stages of the transcription cycle. Furthermore, we estimated a network model describing the relations between the clusters and identified proteins that are representative of a cluster or of the relation between two clusters. Second, we proposed a computational framework that unravels the structure of a biological process from high-dimensional samples characterising different stages of the process. The framework integrates a feature selection procedure and a feature extraction algorithm in order to extract a low-dimensional projection of the high-dimensional samples. We analysed two microarray datasets characterising different cell types of the blood system and found that the extracted representations capture the structure of the hematopoietic stem cell differentiation process. Furthermore, we showed that the low-dimensional projections can be used as a basis for the analysis of gene expression patterns. Finally, we introduced the geometric hidden Markov model (GHMM), a probabilistic model for multivariate time series data. The GHMM assumes that the time series lie on a noisy low-dimensional manifold and infers a dynamical model that reflects the low-dimensional geometry. We analysed multivariate time series data generated with a stochastic model of a biomolecular circuit and showed that the estimated GHMM captures the oscillatory behaviour of the circuit.
Supervisor: Barahona, Mauricio
Sponsor: British Heart Foundation
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral