Title: Model and algorithm development in computational phylogenetics
Author: Money, Daniel
Awarding Body: University of Manchester
Current Institution: University of Manchester
Date of Award: 2012
Phylogenetic trees play an important role in many areas of biological research, from helping us to better understand the process of evolution to drug discovery. Researchers in these areas must be confident that the methods available to them will provide a 'good' phylogenetic tree: one that adequately describes the evolution of the taxa in the tree. One use of such a good tree is in modelling gene family evolution. Gene family size is closely related to copy number variation, and the latter has been shown to have important phenotypic effects, such as causing disease or conferring drug resistance. It is therefore important to establish good methods for inferring gene family evolution, as a better understanding of how gene families evolve could help answer many important biological questions.

In this thesis I investigate methods of searching for a 'good' phylogenetic tree and the space of trees that these methods search. I study methods for finding the neighbours of a tree, an important step in many tree search algorithms. I explore the structure of such methods and investigate how likely they are to find a 'good' tree. I look for patterns in the structure of tree-space that may make tree search easier or allow limited resources to be focused most effectively when searching for a 'good' tree. I also study methods for inferring gene family evolution. I first compare two different methods for inferring gene family history, maximum parsimony and maximum likelihood. I then investigate the models used in the maximum likelihood method to determine which models best describe the data and what we can learn about gene family evolution from those models.

I conclude that the Nearest Neighbour Interchange (NNI) technique should not be used, as it regularly finds 'bad' optima and the structure of the method means that tree search algorithms are likely to get stuck at poor optima. Instead I recommend the use of Subtree Pruning and Re-grafting (SPR) or other, more complex, methods. I also find several properties of tree-space that may be helpful to those designing, or using, the algorithms. My results suggest that maximum likelihood should normally be used for modelling gene family evolution, but that further research is needed into the models used if we are to be confident in the conclusions we draw.
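
As an illustration of what "finding the neighbours of a tree" means for NNI, the following is a minimal sketch and not the implementation used in the thesis. It assumes rooted binary trees written as nested 2-tuples of leaf names. That representation and the function names `canon` and `nni_neighbours` are hypothetical choices made for this example. Tree search software often works on unrooted trees instead. For each internal edge, one of the child's two subtrees is swapped with the child's sibling.

```python
def canon(tree):
    """Return a canonical form so that equal topologies compare equal."""
    if isinstance(tree, str):          # a leaf is just its name
        return tree
    left, right = canon(tree[0]), canon(tree[1])
    # Child order carries no meaning, so sort the two children deterministically.
    return tuple(sorted((left, right), key=repr))


def nni_neighbours(tree):
    """Return the set of rooted binary trees one NNI move away from `tree`."""
    result = set()

    def visit(node, rebuild):
        # `rebuild(sub)` returns the whole tree with `node` replaced by `sub`.
        if isinstance(node, str):
            return
        left, right = node
        for child, sibling in ((left, right), (right, left)):
            if isinstance(child, str):
                continue               # a leaf edge offers no NNI move
            a, b = child
            # Swap the sibling with each of the child's two subtrees in turn.
            result.add(canon(rebuild(((a, sibling), b))))
            result.add(canon(rebuild(((b, sibling), a))))
        visit(left, lambda sub: rebuild((sub, right)))
        visit(right, lambda sub: rebuild((left, sub)))

    visit(tree, lambda sub: sub)
    result.discard(canon(tree))        # never report the tree itself
    return result


if __name__ == "__main__":
    start = (("A", "B"), ("C", "D"))
    for neighbour in sorted(nni_neighbours(start), key=repr):
        print(neighbour)               # four neighbours for this tree
```

Each internal edge contributes only two rearrangements, so the NNI neighbourhood grows linearly with the number of taxa. SPR considers many more rearrangements per tree, at a higher computational cost.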
Supervisor: Not available
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available
Keywords: Tree search ; Gene family ; Phylogenetics