Title: Orthologous pair transfer and hybrid Bayes methods to predict the protein-protein interaction network of the Anopheles gambiae mosquitoes
Author: Li, Qiuxiang
ISNI:       0000 0004 2678 1562
Awarding Body: Imperial College London
Current Institution: Imperial College London
Date of Award: 2009
Based on the published protein-protein interaction maps of five organisms and on public databases of domain-domain and protein-protein interactions, two new approaches are proposed to infer the protein-protein interaction network of the mosquito Anopheles gambiae (A. gambiae). Our main contributions are: i) adopting an orthologous protein pair transfer method that has not previously been reported in the literature; ii) proposing a new hybrid Bayes method; iii) using voting machines at two levels of the combined classifier/predictor; iv) using heterogeneous datasets as the training data; and v) using the trained classifier to predict the protein interaction maps of A. gambiae, arguably one of the least characterised organisms in terms of protein interaction mechanisms.
With the first method, the orthologous and in-paralogous protein clusters are extracted for both species, A. gambiae and Drosophila melanogaster (D. melanogaster). The correspondences between proteins in the two species are identified so that the interactions in the D. melanogaster protein interaction maps can be transferred to pairs of putatively interacting proteins in A. gambiae.
The second strategy, the hybrid Bayes method, is based on the domain composition of proteins: we use a probability model to build virtual domain-domain interaction maps by integrating large-scale protein interaction data from five organisms, namely Saccharomyces cerevisiae, Caenorhabditis elegans, Escherichia coli, Mus musculus and Drosophila melanogaster. Once the virtual domain-domain interaction maps are constructed, we propose two ways to predict the protein-protein interaction maps. These two predictors are compared and then combined into a voting machine that collectively decides whether a protein pair is a candidate interaction. Users can adjust the weights of the different methods to control the output flexibly. Parameters are chosen by running experiments on the training data set.
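The orthologous pair transfer described above can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes the orthologous/in-paralogous clusters are already available as a mapping from each D. melanogaster protein to its A. gambiae counterparts, and every identifier (`dm_P1`, `ag_A`, the function name) is hypothetical.

```python
from itertools import product


def transfer_interactions(source_interactions, ortholog_clusters):
    """Transfer source-species interactions onto target-species protein pairs.

    source_interactions: iterable of (protein_a, protein_b) pairs observed
        in the source species (here, D. melanogaster).
    ortholog_clusters: dict mapping a source protein to the set of target
        proteins (here, A. gambiae) in the same orthologous cluster.
        Assumed to be precomputed; the clustering itself is not shown.
    """
    predicted = set()
    for a, b in source_interactions:
        targets_a = ortholog_clusters.get(a, ())
        targets_b = ortholog_clusters.get(b, ())
        # Every combination of counterparts becomes a putative interaction.
        # If either protein has no counterpart, nothing is transferred.
        for x, y in product(targets_a, targets_b):
            # Store pairs in sorted order so (x, y) and (y, x) coincide.
            predicted.add(tuple(sorted((x, y))))
    return predicted


# Hypothetical toy data, for illustration only.
fly_interactions = [("dm_P1", "dm_P2"), ("dm_P2", "dm_P3")]
clusters = {"dm_P1": {"ag_A"}, "dm_P2": {"ag_B", "ag_C"}}
mosquito_pairs = transfer_interactions(fly_interactions, clusters)
```

In this toy case the first fly interaction yields two putative mosquito pairs because `dm_P2` has two in-paralogous counterparts, while the second yields none because `dm_P3` has no counterpart in the mapping.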
While both the orthologous cluster and hybrid Bayes methods produce encouraging results, the second predicts more protein-protein interactions than the first, yet the two predicted data sets share only a very small fraction of common interactions. We adopt a second voting machine and calibrate its parameters with the putative protein interaction data. These parameters are then used to predict the protein-protein interaction maps of A. gambiae, producing reasonably good results.
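A weighted voting step of the kind described above could look like the following sketch. The abstract does not specify the voting rule, so the normalised weighted average, the threshold, the score ranges and all names here are assumptions made for illustration.

```python
def weighted_vote(scores, weights, threshold):
    """Combine per-method scores for one protein pair into a single decision.

    scores: dict mapping method name to a score in [0, 1]; a method that
        did not predict the pair may be omitted and then counts as 0.
    weights: dict mapping method name to a non-negative, user-chosen weight.
    threshold: minimum combined score for the pair to be accepted.
    Returns (accepted, combined_score).
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("at least one weight must be positive")
    combined = sum(w * scores.get(m, 0.0) for m, w in weights.items()) / total_weight
    return combined >= threshold, combined


# Hypothetical weights and scores, for illustration only.
weights = {"ortholog_transfer": 0.5, "hybrid_bayes": 0.5}
both_methods = weighted_vote(
    {"ortholog_transfer": 1.0, "hybrid_bayes": 0.5}, weights, threshold=0.5
)
bayes_only = weighted_vote({"hybrid_bayes": 0.5}, weights, threshold=0.5)
```

Raising the weight of one method or the threshold changes how many pairs are accepted, which is one way the adjustable weights mentioned in the abstract could shape the output.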
Supervisor: Crisanti, Andrea; Muggleton, Stephen
Sponsor: ORS Scholarship
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral