Title: Methods for demographic inference from single-nucleotide polymorphism data
Author: Mair, Colette
ISNI:       0000 0004 2728 1578
Awarding Body: University of Glasgow
Current Institution: University of Glasgow
Date of Award: 2012
The distribution of the current human population is the result of many complex historical and prehistorical demographic events that have shaped variation in the human genome. Genomic dissimilarities between individuals from different geographical regions can potentially reveal something of these processes. The greatest differences lie between, and within, African populations, and most research suggests that the origin of modern humans lies within Africa. However, competing models have been proposed to describe the evolutionary processes that led to humans inhabiting most of the world. This thesis develops a hypothesis test shown to be powerful in distinguishing between two such models. The first ("migration") model assumes the population of interest is divided into subpopulations that exchange migrants at a constant rate arbitrarily far back in the past, whilst the second ("isolation") model assumes that an ancestral population iteratively segregates into subpopulations that evolve independently. Although both models are simplistic, they capture key aspects of the opposing theories of the history of modern humans. Given single-nucleotide polymorphism (SNP) data from two subpopulations, the method described here tests a global null hypothesis that the data are from an isolation model. The test takes a parametric bootstrap approach, iteratively simulating data under the null hypothesis and computing a set of summary statistics shown to be able to distinguish between the two models. Each summary statistic forms the basis of a statistical hypothesis test in which the observed value of the statistic is compared with the simulated values. The global null hypothesis is retained only if none of the individual tests rejects it. A correction for multiple comparisons is used to control the type I error rate of this compound test. Extensions to this hypothesis test are given which adapt it to deal with SNP ascertainment and to better handle large genomic data sets.
After validation by simulation, where the 'true' model is known, the methods are illustrated on data from the HapMap project, using two Kenyan populations and the Japanese and Yoruba populations.
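The general shape of such a test (simulate under the null, compare each observed summary statistic with its simulated distribution, then correct for multiple comparisons) can be sketched as follows. This is a minimal illustration and not the procedure developed in the thesis: the simulator `toy_simulate_null`, the two statistics in `toy_summarise`, the two-sided Monte Carlo p-value and the Bonferroni correction are all assumptions chosen for brevity, since the abstract does not specify the isolation-model simulator, the summary statistics or the correction used.

```python
import random


def bootstrap_p_value(observed, simulated):
    """Two-sided Monte Carlo p-value, counting the observed value itself (+1)."""
    n = len(simulated)
    below = sum(s <= observed for s in simulated)
    above = sum(s >= observed for s in simulated)
    return min(1.0, 2.0 * (min(below, above) + 1) / (n + 1))


def global_test(observed_stats, simulate_null, summarise,
                n_sim=999, alpha=0.05, seed=0):
    """Parametric-bootstrap test of a global null over several statistics.

    Returns the per-statistic p-values and whether the global null is
    rejected after a Bonferroni correction (an assumed choice).
    """
    rng = random.Random(seed)
    # Simulate data sets under the null and reduce each to its statistics.
    sims = [summarise(simulate_null(rng)) for _ in range(n_sim)]
    k = len(observed_stats)
    p_values = [
        bootstrap_p_value(obs, [s[i] for s in sims])
        for i, obs in enumerate(observed_stats)
    ]
    # Reject the global null if any single test rejects at level alpha / k.
    reject = any(p < alpha / k for p in p_values)
    return p_values, reject


def toy_simulate_null(rng, n_snps=200):
    """Hypothetical stand-in for a null-model simulator (not a coalescent).

    Each SNP gets a shared ancestral allele frequency plus independent
    noise in each of two subpopulations.
    """
    data = []
    for _ in range(n_snps):
        p = rng.uniform(0.05, 0.95)
        a = min(1.0, max(0.0, p + rng.gauss(0, 0.1)))
        b = min(1.0, max(0.0, p + rng.gauss(0, 0.1)))
        data.append((a, b))
    return data


def toy_summarise(data):
    """Two hypothetical summary statistics of paired allele frequencies."""
    mean_diff = sum(abs(a - b) for a, b in data) / len(data)
    shared = sum(0 < a < 1 and 0 < b < 1 for a, b in data) / len(data)
    return (mean_diff, shared)


if __name__ == "__main__":
    # "Observed" data drawn from the toy null itself, for illustration only.
    observed = toy_summarise(toy_simulate_null(random.Random(1)))
    p_values, reject = global_test(observed, toy_simulate_null, toy_summarise)
    print(p_values, reject)
```

In a real analysis the toy simulator would be replaced by a simulator for the fitted isolation model, and a less conservative correction than Bonferroni might be preferred when the summary statistics are strongly correlated.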
Supervisor: Not available
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available
Keywords: HA Statistics