Title: Genealogy estimation for thousands of samples
Author: Speidel, Leo
ISNI: 0000 0004 8507 7583
Awarding Body: University of Oxford
Current Institution: University of Oxford
Date of Award: 2020
Availability of Full Text: Full text unavailable from EThOS.
A fundamental concept that captures our shared genetic history is the genealogy, which traces the genetic relationships of present-day individuals back to their most recent common ancestors. Knowledge of the genealogy would, in principle, capture all evolutionary forces that modified the genetic material ancestral to our DNA, and would hence simplify and enhance many inference problems about past demography and evolution. Despite its importance, genealogy estimation has remained unsolved even for moderately sized data sets: existing methods cannot handle more than a few hundred samples, whereas modern data sets often exceed tens of thousands. In this thesis, I present a method, Relate, that estimates such genealogies for thousands of samples. I demonstrate on a variety of population genetic applications that Relate-based inferences improve on state-of-the-art alternatives in accuracy, resolution, or statistical power. I then reconstruct the genealogy of 2478 humans from 26 populations. I infer historical population sizes and population split times with higher resolution than previously possible and identify highly diverged lineages, reflecting Neanderthal and Denisovan introgression in non-Africans and unknown events in Africans. I report previously unreported regions that show evidence of strong positive selection and identify multi-allelic traits likely to be under selection. I additionally apply Relate to 50 wild mice sampled in France, India, and Taiwan and demonstrate that the estimated genealogies contain rich information about their demographic history, mutation rate trends consistent with GC-biased gene conversion, and strong indications of selective sweeps in each population.
Supervisor: Myers, Simon
Sponsor: Engineering and Physical Sciences Research Council
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available
Keywords: Genetics ; Statistics