Title: Automated methods for the determination of homologous relationships and functional similarities between protein domains
Author: Redfern, Oliver Charles
ISNI: 0000 0004 2670 3409
Awarding Body: University of London
Current Institution: University College London (University of London)
Date of Award: 2007
Abstract: CATH is a protein database of structural domains that are assigned to superfamilies on evidence of a common evolutionary ancestor. These superfamilies are further grouped into folds by overall structural similarity. This thesis explores several automated methods for recognising homologous relationships between these domains using structural data from the Protein Data Bank (PDB). The aim of this work was to aid the manual classification of domains into the database and to provide putative functional assignments for structures solved by the structural genomics initiatives. A fast and novel algorithm, CATHEDRAL, was developed to make fold assignments to regions of polypeptide chains. By combining a fast secondary-structure method (GRATH) with a slower residue-based method (SSAP), the algorithm was able to assign boundaries accurately for distant relatives that are undetectable by sequence methods. Sequence and structural conservation patterns were combined in a novel algorithm, FLORA, to develop structural templates specific to catalytic function. FLORA was able to predict the correct functional site in 80% of cases and, combined with global structure comparison, was able to assign domains to enzyme families within diverse superfamilies. Techniques in structure comparison were also applied to ab initio models of protein domains in order to assign them to fold groups within the CATH database. A novel scoring method was developed to pre-select models that were more likely to have adopted the correct fold. A selected sample of models for each target structure was then compared against representatives from the CATH database using the MAMMOTH and SSAP algorithms. Data from these alignments were combined using a Support Vector Machine to assign the target to a fold group within CATH. This work was generously supported by the Engineering and Physical Sciences Research Council.
Supervisor: Not available
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available