Title: Statistical analysis of natural selection in RNA virus populations
Author: Bhatt, Samir
ISNI: 0000 0004 2710 9145
Awarding Body: University of Oxford
Current Institution: University of Oxford
Date of Award: 2010
Availability of Full Text: Full text unavailable from EThOS.
A key goal of modern evolutionary biology is the identification of genes or genome regions that have been targeted by natural selection. Methods for detecting natural selection utilise the information sampled in contemporary gene sequences and test for deviation from the null hypothesis of neutrality. One such method is the McDonald-Kreitman test (MK test), which detects the molecular 'footprint' left by natural selection by considering the frequency of observed mutations within the sampled population. In this thesis I investigate the applicability of the MK test to viral populations and develop several new methods based on the original MK test. In chapter 2, I use a combination of simulation and methodological improvements to show that the MK test can have low error when applied to the analysis of RNA virus populations. Then, in chapter 3, I develop an extension of the MK test with the purpose of estimating rates of adaptive fixation for all genes of the human influenza A virus subtypes H1N1 and H3N2. My results are consistent with previous studies on selection in influenza virus populations, and provide a new perspective on the evolutionary dynamics of human influenza virus. In chapter 4, I develop a formal statistical framework, based on the MK test, for calculating the number of non-neutral sites at any frequency range in the site frequency spectrum. In this framework, I introduce a new method for reconstructing the site frequency spectrum that incorporates sampling error and allows for the inclusion of prior knowledge. Using this new framework, I show that the majority of nucleotide sites in hepatitis C virus sequences sampled during chronic infection represent deleterious mutations. Finally, in chapter 5, I use the generalised framework introduced in chapter 4 to develop a statistic for evaluating the deleterious mutation load of a population. I apply this test to sequences that represent 96 RNA virus genes and show that my approach has comparable power to equivalent phylogenetic methods. In this thesis I have developed computationally efficient methods for analysis of genetic data from virus populations. It is my hope that these methods will become useful given the explosion in sequence data that has accompanied recent improvements in sequencing technology.
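For readers unfamiliar with the MK test that the abstract takes as its starting point, the classic form of the test can be sketched as follows. This is a minimal illustration of the standard 2x2 contingency version only, and it does not reproduce the extended methods developed in the thesis. The function names (`fisher_exact_two_sided`, `mk_test`), the returned field names and the example counts are assumptions introduced here for illustration. The sketch uses only the Python standard library (Python 3.8 or later for `math.comb`).

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability that the top-left cell equals k
        # when all row and column totals are held fixed.
        return comb(row1, k) * comb(row2, col1 - k) / total_tables

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum over every table that is no more probable than the observed one
    # (the small tolerance guards against floating-point ties).
    p_value = sum(
        p
        for p in (prob(k) for k in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


def mk_test(dn, ds, pn, ps):
    """Classic MK test (hypothetical helper, not the thesis implementation).

    dn, ds: nonsynonymous and synonymous fixed differences between species.
    pn, ps: nonsynonymous and synonymous polymorphisms within the sample.
    """
    if min(dn, ds, pn, ps) < 0:
        raise ValueError("counts must be non-negative")
    if dn == 0 or ps == 0:
        raise ValueError("neutrality index is undefined when dn or ps is zero")
    # Neutrality index: (pn / ps) / (dn / ds), written to avoid dividing by ds.
    neutrality_index = (pn * ds) / (ps * dn)
    return {
        "neutrality_index": neutrality_index,
        # A commonly used estimate of the proportion of adaptive substitutions.
        "alpha": 1.0 - neutrality_index,
        "p_value": fisher_exact_two_sided(dn, ds, pn, ps),
    }


# Illustrative counts only (an assumption, not data from the thesis).
result = mk_test(dn=7, ds=17, pn=2, ps=42)
print(result)
```

Under neutrality the ratio of nonsynonymous to synonymous changes is expected to be the same for fixed differences and for polymorphisms. A neutrality index below 1 (positive `alpha`) indicates an excess of nonsynonymous fixed differences, which is consistent with adaptive fixation. An index above 1 indicates an excess of nonsynonymous polymorphism, which is consistent with segregating deleterious mutations.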
Supervisor: Pybus, Oliver
Sponsor: Natural Environment Research Council
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available
Keywords: Evolution (zoology) ; Mathematical genetics and bioinformatics (statistics) ; evolution ; RNA virus ; adaptation ; influenza ; McDonald Kreitman Test ; Site Frequency Spectrum