Title: Simultaneous estimation of population size changes and splits times using importance sampling
Author: Forest, Marie
ISNI: 0000 0004 5357 2862
Awarding Body: University of Oxford
Current Institution: University of Oxford
Date of Award: 2014
Availability of Full Text:
Access from EThOS:
Full text unavailable from EThOS. Restricted access.
Abstract: The genome is a treasure trove of information about the history of an individual, his population, and his species. For as long as genomic data have been available, methods have been developed to retrieve this information and learn about population history. Over the last decade, large international genomic projects (e.g. the HapMap Project and the 1000 Genomes Project) have offered access to high-quality data collected from thousands of individuals from a vast number of populations. Freely available to all, these databases offer the possibility to develop new methods to uncover the history of the peopling of the world by modern humans. Due to the complexity of the problem and the large amount of available data, all developed methods either simplify the model with strong assumptions or use an approximation; they also dramatically down-sample their data by either using fewer individuals or only portions of the genome. In this thesis, we present a novel method to jointly estimate the time of divergence of a pair of populations and their variable sizes, a previously unsolved problem. The method uses multiple regions of the genome with low recombination rate. For each region, we use an importance sampler to build a large number of possible genealogies, and from those we estimate the likelihood function of parameters of interest. By modelling the population sizes as piecewise constant within fixed time intervals, we aim to capture population size variation through time. We show via simulation studies that the method performs well in many situations, even when the model assumptions are not totally met. We apply the method to five populations from the 1000 Genomes Project, obtaining estimates of split times between European groups and among Europe, Africa and Asia. We also infer shared and non-shared bottlenecks in out-of-Africa groups, expansions following population separations, and the sizes of ancestral populations further back in time.
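As a rough illustration of two ideas named in the abstract (population sizes that are piecewise constant within fixed time intervals, and importance sampling), the Python sketch below computes the coalescence-time density for a single pair of lineages under a piecewise-constant size history, then estimates the mean coalescence time by drawing from a simpler constant-size proposal and reweighting. It is a toy and not the thesis's method: the thesis samples full genealogies across a pair of populations, whereas this handles two lineages in one population. All function names, parameter values, the diploid rate 1/(2N) per generation, and the exponential proposal are assumptions made for illustration.

```python
import bisect
import math
import random


def cumulative_hazard(t, breakpoints, sizes):
    """Integrated pairwise coalescence rate from time 0 to t (in generations).

    breakpoints: increasing interval start times, the first being 0.0.
    sizes: assumed diploid size N_i in interval i; the last interval is open-ended.
    """
    total = 0.0
    for i, start in enumerate(breakpoints):
        if t <= start:
            break
        end = breakpoints[i + 1] if i + 1 < len(breakpoints) else math.inf
        total += (min(t, end) - start) / (2.0 * sizes[i])
    return total


def coalescence_density(t, breakpoints, sizes):
    """Density of the coalescence time of two lineages at time t >= 0."""
    i = bisect.bisect_right(breakpoints, t) - 1  # interval containing t
    rate = 1.0 / (2.0 * sizes[i])
    return rate * math.exp(-cumulative_hazard(t, breakpoints, sizes))


def importance_sampled_mean_time(breakpoints, sizes, proposal_size, n_draws, seed=0):
    """Estimate E[T] under the piecewise model using draws from a
    constant-size (exponential) proposal, weighted by target / proposal."""
    rng = random.Random(seed)
    q_rate = 1.0 / (2.0 * proposal_size)
    total = 0.0
    for _ in range(n_draws):
        t = rng.expovariate(q_rate)
        q = q_rate * math.exp(-q_rate * t)
        weight = coalescence_density(t, breakpoints, sizes) / q
        total += weight * t
    return total / n_draws


if __name__ == "__main__":
    # Hypothetical history: N = 1000 for the first 1000 generations back, then N = 5000.
    estimate = importance_sampled_mean_time([0.0, 1000.0], [1000.0, 5000.0], 5000.0, 50_000)
    print(f"estimated mean coalescence time: {estimate:.0f} generations")
```

In this sketch the proposal size is set to the largest size in the history, so the proposal's tail is at least as heavy as the target's. A proposal with a lighter tail would give importance weights that grow without bound for old coalescence times, and the estimate would be unreliable.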
Supervisor: Marchini, Jonathan; Myers, Simon
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available
Keywords: Mathematical genetics and bioinformatics (statistics); Stochastic processes; Statistics; methodology; population history; coalescent theory; genetics