Title: The detection, structure and uses of extended haplotype identity in population genetic data
Author: Xifara, Dionysia-Kiara
ISNI: 0000 0004 5915 6413
Awarding Body: University of Oxford
Current Institution: University of Oxford
Date of Award: 2014
Availability of Full Text: Full text unavailable from EThOS.
In large-scale population genomic data sets, individual chromosomes are likely to share extended regions of haplotype identity with others in the sample. Patterns of local haplotype sharing can be highly informative about many processes, including historical demography, selection and recombination. However, in outbred diploid populations, the identification of extended shared haplotypes is not straightforward, particularly in the presence of low levels of genotyping error. Here, we introduce a model-based method for accurately detecting extended haplotype sharing between sets of individuals from unphased data. We describe two implementations of the algorithm that can be applied to data sets consisting of thousands of samples. The method leads naturally to an approach for statistical haplotype estimation, which is shown to be comparable in accuracy to current methods. By applying the method to genome-wide SNP data from over 5,000 samples from the UK, we show that the N50 of maximal haplotype sharing between unrelated samples is typically 2 cM, consistent with a population history of rapid exponential growth that started approximately 125 generations ago. In contrast, within two Greek population isolates of approximately 700 individuals, the N50 of maximal haplotype sharing is 12.5 cM, while for an unrelated Greek sample of the same size the N50 is 1.3 cM. By assessing the size and geographical distribution of maximal haplotype sharing within and between all Greek cohorts, we draw inferences about the degree of isolation of each cohort and about recent migration. We additionally date recent coancestry to about 10 generations for the isolates and 90 generations for the unrelated sample, and finally attempt to date the time of divergence between them.
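The abstract summarises haplotype sharing with an N50 statistic but does not define it. The sketch below assumes the definition familiar from genome assembly: the largest length L such that segments of length at least L account for at least half of the total length. The function name `n50` and the example values are hypothetical, and the thesis may compute the statistic differently.

```python
from typing import Sequence


def n50(lengths_cm: Sequence[float]) -> float:
    """Return the N50 of a collection of segment lengths (in cM).

    Assumed definition (borrowed from genome assembly, not confirmed by the
    abstract): the largest length L such that segments of length >= L
    together cover at least half of the summed length.
    """
    if not lengths_cm:
        raise ValueError("need at least one segment length")
    # Walk from the longest segment down, accumulating length.
    ordered = sorted(lengths_cm, reverse=True)
    half_total = sum(ordered) / 2.0
    running = 0.0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return ordered[-1]  # only reached in degenerate cases; kept for safety


# Hypothetical maximal shared-segment lengths (cM), one per sample.
example_lengths = [1.0, 1.0, 2.0, 2.0, 4.0]
print(n50(example_lengths))  # prints 2.0
```

Under this definition, a few very long shared segments (as expected in an isolate) pull the N50 up sharply, which matches the contrast the abstract draws between 12.5 cM and 1.3 cM.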
Supervisor: McVean, Gilean
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available
Keywords: Mathematical genetics and bioinformatics (statistics) ; haplotype estimation ; identity by descent ; extended haplotype sharing ; ancestry