Title: Development and evaluation of statistical approaches in proteomic biomarker discovery
Author: Patel, Amit
ISNI:       0000 0004 2734 8832
Awarding Body: Cranfield University
Current Institution: Cranfield University
Date of Award: 2011
A biomarker is a characteristic that is objectively measured and evaluated as an indicator of normal biological processes, pathogenic processes or pharmacological responses to a therapeutic intervention. This project addressed the identification of potential biomarker candidates from experimental data comparing samples that display divergent physiological traits. Chapter 1 introduces the topic and the aims of the project. The primary aim was to identify the most suitable statistical analysis methods and data pre- and post-treatment options for potential biomarker identification from proteomic datasets. The product of this work was a statistical analysis pipeline for identifying potential biomarker candidates from proteomic experimental data. Proteomic data often suffer from missing values, so methods for dealing with these were also evaluated. Chapter 2 outlines the data sets used and presents an overview of the “Biomarker Hunter” pipeline software created in this project. Chapter 3 evaluates which univariate statistical methods are appropriate for biomarker identification and reports the results of biomarker identification using these techniques. Chapter 4 evaluates options for data pre- and post-processing. Chapter 5 suggests the use of missing value imputation and offers a novel clustering algorithm for dealing with missing values. The software pipeline also offers multivariate statistical methods, which are evaluated in Chapter 6. Chapter 7 provides business context for both biomarker discovery and the statistical analysis software available for proteomic biomarker discovery. As well as providing a software pipeline for the identification of biomarkers, the project aimed to propose a strategy for the statistical analysis of proteomic experimental data.
Strong conclusions regarding the ideal statistical approach could only have been drawn if a list of actual, validated biomarkers had been available. Because this information was not available, a strategy was instead suggested on the basis of the published literature and the author’s interpretation of the results of this study. In terms of data pre-processing, this strategy involved not averaging technical replicates and using total abundance normalisation to reduce technical variation. A novel clustering algorithm was suggested to reduce the number of missing values before applying existing methods of missing value imputation. Following statistical analysis, multiple testing correction methods should be applied to reduce the number of false positives.
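Two steps of the suggested strategy, total abundance normalisation and multiple testing correction, can be illustrated with a minimal sketch. This is not code from the thesis or from the “Biomarker Hunter” pipeline. The function names are hypothetical, and the abstract does not say which correction method was used, so the Benjamini-Hochberg procedure appears here only as one common choice. The normalisation shown, which rescales each sample so that its total equals the mean total across samples, is likewise an assumed reading of "total abundance normalisation".

```python
def total_abundance_normalise(samples):
    """Rescale each sample so its total abundance equals the mean total.

    `samples` is a list of lists, one inner list of protein abundances per
    sample. Technical replicates are kept as separate samples rather than
    averaged. Missing values are represented as None and left untouched.
    Assumes every sample has a positive total abundance.
    """
    totals = [sum(v for v in s if v is not None) for s in samples]
    target = sum(totals) / len(totals)
    return [
        [None if v is None else v * target / t for v in s]
        for s, t in zip(samples, totals)
    ]


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Toy abundances for two samples of three proteins (illustrative only).
    print(total_abundance_normalise([[1.0, 3.0, None], [2.0, 6.0, 4.0]]))
    # Toy p-values from per-protein univariate tests (illustrative only).
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Proteins whose adjusted p-value falls below a chosen false discovery rate threshold would then be carried forward as candidate biomarkers.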
Supervisor: Bessant, Conrad
Sponsor: Oxford BioTherapeutics (OBT)
Qualification Name: Thesis (D.Eng.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available