Title: Studies in probabilistic sequence alignment and evolution
Author: Holmes, I.
Awarding Body: University of Cambridge
Current Institution: University of Cambridge
Date of Award: 1999
Availability of Full Text:
Full text unavailable from EThOS.
Please contact the current institution’s library for further details.
The complete sequencing of whole genomes presents opportunities for detailed study of molecular evolution. This thesis combines theoretical developments of Bayesian approaches in bioinformatics with analysis of duplications in the recently completed C. elegans genome. Developments in the Bayesian probabilistic framework for sequence analysis using hidden Markov models (HMMs) are described. The principal HMM algorithms are reviewed, including alignment, training and model comparison. Theory is developed for prediction of alignment accuracy and tested using simulations. Software to provide accuracy measures for multiple alignments, based on the popular HMMER suite of profile-based alignment algorithms, is presented and evaluated with reference to the Pfam database of multiple alignments. Several of these statistical techniques are applied to an analysis of genomic duplications in the C. elegans genome. The completion of this, the first animal genome, offers an opportunity to study the random duplications that are believed to be the first step in the evolution of a new gene. The construction of a database of non-coding duplications is described, and molecular evolutionary parameters in C. elegans are measured from the data and reported. A method of dating gene duplications using alignments between conserved introns is presented and compared to existing methods using Bayesian techniques developed earlier in the dissertation. Amongst the principal agents involved in creating genomic duplications are transposons; one of the simplest families of transposon is the Tc1-mariner family, of which two distinct active subfamilies are well known in C. elegans. Using HMM profiles, six new subfamilies of mariner-like transposon have been identified in the C. elegans genome. Several of the new subfamilies display interesting homologies to one another, suggestive of common mechanisms of transpositional catalysis.
Finally, the software tools developed during this project are described and made available for public retrieval from the Sanger Centre web site.
Supervisor: Not available
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available