Use this URL to cite or link to this record in EThOS:
Title: Model selection strategies in genome-wide association studies
Author: Keildson, Sarah
ISNI:       0000 0004 2723 8393
Awarding Body: University of Oxford
Current Institution: University of Oxford
Date of Award: 2011
Availability of Full Text:
Access from EThOS:
Full text unavailable from EThOS. Please try the link below.
Access from Institution:
Unravelling the genetic architecture of common diseases is a continuing challenge in human genetics. While genome-wide association studies (GWAS) have proven to be successful in identifying many new disease susceptibility loci, the extension of these studies beyond single-SNP methods of analysis has been limited. The incorporation of multi-locus methods of analysis may, however, increase the power of GWAS to detect genes of smaller effect size, as well as genes that interact with each other and the environment. This investigation carried out large-scale simulations of four multi-locus model selection techniques; namely forward and backward selection, Bayesian model averaging (BMA) and least angle regression with a lasso modification (lasso), in order to compare the type I error rates and power of each method. At a type I error rate of ~5%, lasso showed the highest power across varied effect sizes, disease frequencies and genetic models. Lasso penalized regression was then used to perform three different types of analysis on GWAS data. Firstly, lasso was applied to the Wellcome Trust Case Control Consortium (WTCCC) data and identified many of the WTCCC SNPs that had a moderate-strong association (p<10-5) type 2 diabetes (T2D), as well as some of the moderate WTCCC associations (p<10-4) that have since been replicated in a large-scale meta-analysis. Secondly, lasso was used to fine-map the 17q21 childhood asthma risk locus and identified putative secondary signals in the 17q21 region, that may further contribute to childhood asthma risk. Finally, lasso identified three potential interaction effects potentially contributing towards coronary artery disease (CAD) risk. While the validity of these findings hinges on their replication in follow-up studies, the results suggest that lasso may provide scientists with exciting new methods of dissecting, and ultimately understanding, the complex genetic framework underlying common human diseases.
Supervisor: Farrall, Martin ; Morris, Andrew Sponsor: Not available
Qualification Name: Thesis (Ph.D.) Qualification Level: Doctoral
EThOS ID:  DOI: Not available
Keywords: Genetics (life sciences) ; common diseases ; genome-wide association studies ; model selection