Use this URL to cite or link to this record in EThOS: http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.603116
Title: Using phylogenetics and model selection to investigate the evolution of RNA genes in genomic alignments
Author: Allen, James
Awarding Body: University of Manchester
Current Institution: University of Manchester
Date of Award: 2013
Availability of Full Text:
Access from EThOS:
Access from Institution:
Abstract:
The diversity and range of the biological functions of non-coding RNA molecules (ncRNA) have only recently been realised, and phylogenetic analysis of the RNA genes that define these molecules can provide important insights into the evolutionary pressures acting on RNA genes, and can lead to a better understanding of the structure and function of ncRNA. An appropriate dataset is fundamental to any evolutionary analysis, and because existing RNA alignments are unsuitable, I describe a software pipeline to derive RNA gene datasets from genomic alignments. RNA gene prediction software has not previously been evaluated on such sets of known RNA genes, and I find that two popular methods fail to predict the genes in approximately half of the alignments. In addition, high numbers of predictions are made in flanking regions that lack RNA genes, and these results provide motivation for subsequent phylogenetic analyses, because a better understanding of RNA gene evolution should lead to improved methods of prediction. I analyse the RNA gene alignments with a range of evolutionary models of substitution and examine which models best describe the changes evident in the alignment. The best models are expected to provide more accurate trees, and their properties can also shed light on the evolutionary processes that occur in RNA genes. Comparing DNA and RNA substitution models is non-trivial however, because they describe changes between two different types of state, so I present a proof that allows models with different state spaces to be compared in a statistically valid manner. I find that a large proportion of RNA genes are well described by a single RNA model that includes parameters describing both nucleotides and RNA structure, highlighting the multiple levels of constraint that act on the genes. The choice of model affects the inference of a phylogenetic tree, suggesting that model selection, with RNA models, should be standard practice for analysis of RNA genes.
Supervisor: Lovell, Simon; Whelan, Simon Sponsor: Natural Environment Research Council ; European Bioinformatics Institute (EMBL-EBI)
Qualification Name: Thesis (Ph.D.) Qualification Level: Doctoral
EThOS ID: uk.bl.ethos.603116  DOI: Not available
Keywords: ncRNA ; PHASE software ; model selection
Share: