Title: Multivariate prediction models for bio-analytical data
Author: Rantalainen, Mattias John
ISNI: 0000 0000 7184 587X
Awarding Body: Imperial College London
Current Institution: Imperial College London
Date of Award: 2008
Quantitative bio-analytical techniques that enable parallel measurements of large numbers of biomolecules generate vast amounts of information for studying and characterising biological systems. These analytical methods are commonly referred to as omics technologies, and can be applied for measurements of e.g. mRNA transcript, protein or metabolite abundances in a biological sample. The work presented in this thesis focuses on the application of multivariate prediction models for modelling and analysis of biological data generated by omics technologies. Omics data commonly contain up to tens of thousands of variables, which are often both noisy and multicollinear. Multivariate statistical methods have previously been shown to be valuable for visualisation and predictive modelling of biological and chemical data with similar properties to omics data. In this thesis, currently available multivariate modelling methods are used in new applications, and new methods are developed to address some of the specific challenges associated with modelling of biological data. Three closely related areas of multivariate modelling of biological data are described and demonstrated in this thesis. First, a multivariate projection method is used in a novel application for predictive modelling between omics data sets, demonstrating how data from two analytical sources can be integrated and modelled together by exploring covariation patterns between the data sets. This approach is exemplified by modelling of data from two studies, the first containing proteomic and metabolic profiling data and the second containing transcriptomic and metabolic profiling data. Second, a method for piecewise multivariate modelling of short time-series data is developed and demonstrated by modelling of simulated data as well as metabolic profiling data from a toxicity study, providing a new method for characterisation of multivariate bio-analytical time-series data.
Third, a kernel-based method is developed and applied for non-linear multivariate prediction modelling of omics data, addressing the specific challenge of modelling non-linear variation in biological data.
Supervisor: Holmes, Elaine; Nicholson, Jeremy
Sponsor: METAGRAD
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral