Title: Integrated Bayesian approach to incorporate prior biological knowledge in the investigation of genetic variants associated with lung function
Author: das Neves Sousa Pereira, Miguel Maria
ISNI: 0000 0004 7655 398X
Awarding Body: University of London
Current Institution: Imperial College London
Date of Award: 2018
Genetic association studies aim to identify single-nucleotide polymorphisms (SNPs) that are associated with complex diseases and traits. These studies, especially those performed at the genome-wide level, have successfully identified many SNPs associated with several diseases, but current findings explain only a small fraction of the estimated genetic component of these diseases. This thesis contributes to the scientific literature by developing a novel statistical approach for the analysis of genetic association studies and by applying it to the discovery of new SNPs associated with lung function. Lung function has an estimated genetic component of ~40% and is epidemiologically important, since low lung function has been associated with all-cause mortality.
First, I investigate a Bayesian hierarchical shrinkage model that performs a joint analysis of SNPs and includes biological information about the SNPs in the analysis. The approach involves building a score from biological knowledge retrieved from online databases; this score is used to inform the shrinkage towards a null SNP effect. The performance of this approach was investigated in a simulation study and in a real-data example where the true associations were known. It consistently outperformed the standard analysis, in which SNPs are analysed separately using linear regression models.
Second, I apply this approach to data from the large UK Biobank study to investigate the effects on lung function of variants located in 403 genes related to lung development. The results from this analysis were then replicated in three independent cohorts. Combining the standard analysis with the Bayesian approach identified 65 new variants in 31 different genes, nearly doubling the number of known variants associated with lung function.
Finally, I present Bioshrink, a freely accessible, user-friendly R Shiny application that allows third-party users to apply the Bayesian approach to their own genetic association studies.
Supervisor: Minelli, Cosetta; Thompson, John
Sponsor: National Heart and Lung Institute
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral