Title: Prediction and validation of protein interaction
Author: Chen, Pao-Yang
ISNI:       0000 0001 3535 8107
Awarding Body: University of Oxford
Current Institution: University of Oxford
Date of Award: 2008
Availability of Full Text:
Full text unavailable from EThOS.
Please contact the current institution’s library for further details.
The biological functions of a protein within the cell are governed by its protein interactions. While these interactions have recently become widely available for many organisms, they are not yet fully explored with regard to the insights into protein characteristics they might provide. In this dissertation, I not only explore protein characteristics, e.g., structure and function, but also utilise these characteristics in the prediction of protein interactions. An exploratory analysis of protein interactions as networks reveals local clustering, including a tendency towards protein triangle formation. This observation prompted me to develop statistical approaches that go beyond pairwise interactions and that consider the relationship between protein characteristics and protein interactions. My methods consider both pairwise and triple-wise interactions. Prior information from other organisms is also incorporated, and diverse biological characteristics can be integrated simultaneously. In the task of predicting protein characteristics, for large networks my pair-based score is more accurate than the popular Majority Vote method. Surprisingly, however, my triple-based score does not outperform the simpler pair-based method. This may be due to poor data quality and/or other unknown biological factors. In the task of predicting protein interactions, I demonstrate that the inclusion of network structure in the form of triples significantly improves results over three other standard interaction predictors as well as a pair-based version of the method. It also achieves greater coverage. Unsurprisingly, therefore, it appears that different tasks may require different models. A global model for protein interaction networks based on triples is outperformed by a pairwise method when it comes to predicting protein characteristics. Conversely, a triple-based method outperforms a model based on pairwise interactions when it comes to predicting interactions.
My methods offer three main improvements over current approaches. Firstly, they consider network structures of pairs and triples in the prediction. Secondly, in all my methods multiple protein characteristics can be considered simultaneously, which greatly improves the prediction. Thirdly, data from multiple species can be easily integrated. Interestingly, on this last point, the result suggests that there may be fundamental differences between the networks of eukaryotes and prokaryotes.
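As an illustration only, the two network ideas named in the abstract can be sketched on a toy graph: counting protein triangles (the local clustering observation) and a Majority Vote baseline in its commonly described form, where an unannotated protein receives the most frequent annotation among its interaction partners. This is not the thesis's pair-based or triple-based score. The function names, the toy edges, the annotation labels, and the alphabetical tie-breaking rule are all assumptions made for this sketch.

```python
from collections import Counter
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map from (protein, protein) pairs."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def count_triangles(adj):
    """Count distinct triangles: three proteins that all interact pairwise."""
    seen = 0
    for nbrs in adj.values():
        # Every pair of neighbours that also interact closes a triangle.
        for u, v in combinations(sorted(nbrs), 2):
            if v in adj[u]:
                seen += 1
    # Each triangle is found once from each of its three corners.
    return seen // 3


def majority_vote(adj, annotations, protein):
    """Return the most frequent annotation among annotated neighbours.

    Ties are broken alphabetically (an assumption of this sketch).
    Returns None when no neighbour carries an annotation.
    """
    votes = Counter(
        annotations[n] for n in adj.get(protein, ()) if n in annotations
    )
    if not votes:
        return None
    return min(votes, key=lambda label: (-votes[label], label))


# Hypothetical toy network and annotations, not data from the thesis.
toy_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"), ("D", "E")]
toy_annotations = {"A": "kinase", "B": "kinase", "D": "transport"}
toy_adj = build_adjacency(toy_edges)

print(count_triangles(toy_adj))
print(majority_vote(toy_adj, toy_annotations, "C"))
```

In this toy network, proteins A, B and C form the only triangle, and protein C has two neighbours annotated "kinase" against one annotated "transport", so the vote for C is decided by its pairwise neighbours alone. A triple-based approach of the kind the abstract describes would additionally use the triangle structure, which this baseline ignores.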
Supervisor: Not available
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:  DOI: Not available