Use this URL to cite or link to this record in EThOS: http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.580985
Title: Bayesian methods for estimating human ancestry using whole genome SNP data
Author: Churchhouse, Claire
Awarding Body: University of Oxford
Current Institution: University of Oxford
Date of Award: 2012
Availability of Full Text:
Access from EThOS:
Full text unavailable from EThOS. Restricted access.
Access from Institution:
Abstract:
The past five years have seen the discovery of a wealth of genetic variants associated with an incredible range of diseases and traits that have been identified in genome-wide association studies (GWAS). These GWAS have typically been performed in individuals of European descent, prompting a call for such studies to be conducted over a more diverse range of populations. These include groups such as African Americans and Latinos, as they are recognised as bearing a disproportionately large burden of disease in the U.S. population. The variation in ancestry among such groups must be correctly accounted for in association studies to avoid spurious hits arising from differences in ancestry between cases and controls. Such ancestral variation is not entirely problematic, as it may also be exploited to uncover loci associated with disease in an approach known as admixture mapping, or to estimate recombination rates in admixed individuals. Many models have been proposed to infer genetic ancestry, and they differ in their accuracy, the type of data they employ, their computational efficiency, and whether or not they can handle multi-way admixture. Despite the number of existing models, there is an unfulfilled requirement for a model that performs well even when the ancestral populations are closely related, is extendible to multi-way admixture scenarios, and can handle whole-genome data while remaining computationally efficient. In this thesis we present a novel method of ancestry estimation named MULTIMIX that satisfies these criteria. The underlying model we propose uses a multivariate normal to approximate the distribution of a haplotype at a window of contiguous SNPs given the ancestral origin of that part of the genome. The observed allele types and the ancestry states that we aim to infer are incorporated into a hidden Markov model to capture the correlations in ancestry that we expect to exist between neighbouring sites.
We show via simulation studies that its performance on two-way and three-way admixture is competitive with state-of-the-art methods, and apply it to several real admixed samples of the International HapMap Project and the 1000 Genomes Project.
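The modelling idea in the abstract (a multivariate normal emission for each window of SNPs, linked across windows by a hidden Markov model) can be illustrated with a minimal sketch. This is not the MULTIMIX implementation: the function names `mvn_logpdf` and `ancestry_posteriors`, the fixed `stay` probability used for transitions, and the toy allele frequencies are all assumptions made for illustration, and the record does not state how MULTIMIX parameterises transitions or estimates the window means and covariances.

```python
import numpy as np

def mvn_logpdf(x, mean, cov):
    """Log density of a multivariate normal evaluated at x."""
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    quad = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (x.size * np.log(2.0 * np.pi) + logdet + quad)

def ancestry_posteriors(hap, means, covs, stay=0.95):
    """Forward-backward over windows (illustrative, hypothetical API).

    hap:   (W, S) alleles coded 0/1, W windows of S SNPs each
    means: (K, W, S) per-population window means
    covs:  (K, W, S, S) per-population window covariances
    stay:  assumed probability of keeping the same ancestry between windows
    Returns a (W, K) array of posterior ancestry probabilities.
    """
    K, W = means.shape[0], means.shape[1]
    loglik = np.array([[mvn_logpdf(hap[w], means[k, w], covs[k, w])
                        for k in range(K)] for w in range(W)])
    # Rescale per window to avoid underflow; the posteriors are unaffected.
    emis = np.exp(loglik - loglik.max(axis=1, keepdims=True))
    # Assumed transition matrix: constant switch rate between windows.
    trans = np.full((K, K), (1.0 - stay) / (K - 1))
    np.fill_diagonal(trans, stay)

    fwd = np.empty((W, K))
    bwd = np.ones((W, K))
    fwd[0] = emis[0] / K          # uniform prior over ancestries
    fwd[0] /= fwd[0].sum()
    for w in range(1, W):         # forward pass, normalised at each step
        fwd[w] = emis[w] * (fwd[w - 1] @ trans)
        fwd[w] /= fwd[w].sum()
    for w in range(W - 2, -1, -1):  # backward pass
        bwd[w] = trans @ (emis[w + 1] * bwd[w + 1])
        bwd[w] /= bwd[w].sum()
    post = fwd * bwd
    return post / post.sum(axis=1, keepdims=True)

# Toy data (assumed): two source populations with allele frequencies 0.1 and 0.9,
# six windows of four SNPs, and a haplotype that switches ancestry halfway.
K, W, S = 2, 6, 4
freqs = np.array([0.1, 0.9])
means = np.stack([np.full((W, S), p) for p in freqs])
covs = np.stack([np.tile(p * (1 - p) * np.eye(S), (W, 1, 1)) for p in freqs])
hap = np.vstack([np.zeros((3, S)), np.ones((3, S))])
post = ancestry_posteriors(hap, means, covs)
print(post.argmax(axis=1))
```

In this toy setting the covariances are diagonal, so the windows carry no linkage information; a non-diagonal covariance estimated from reference haplotypes is where a multivariate normal approximation would capture correlation between neighbouring SNPs within a window.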
Supervisor: Marchini, Jonathan
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID: uk.bl.ethos.580985
DOI: Not available
Keywords: Bioinformatics (life sciences) ; Genetics (life sciences) ; Probability theory and stochastic processes ; Statistics (see also social sciences) ; Computationally-intensive statistics ; Mathematical genetics and bioinformatics (statistics) ; Probability ; Stochastic processes ; genotype ; admixture ; Bayesian