Use this URL to cite or link to this record in EThOS: http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.730328
Title: Methods to jointly analyze multiple phenotypes
Author: Dahl, Andrew
ISNI: 0000 0004 6496 1193
Awarding Body: University of Oxford
Current Institution: University of Oxford
Date of Award: 2016
Availability of Full Text:
Access from EThOS:
Full text unavailable from EThOS. Restricted access.
Abstract:
Linear mixed models (LMMs) have re-emerged as a central tool in statistical genetics. Fixed effects capture genetic variants tested for association. Random effects leverage aggregate relatedness while remaining agnostic to specific genetic mechanisms, naturally modeling heritability and controlling for polygenic background and confounding from population or family structure in genome-wide association studies (GWAS). Multiple random effects can partition heritability amongst many biologically meaningful variance components (VCs). Concurrently, genetic studies have begun to analyze multiple traits. This can improve power by adding data and can inform the path from genotype to phenotype, e.g. with graphical models, pleiotropy detection or endophenotyping. Multi-trait analyses are natural for biobanks and high-throughput phenotypic measurements like gene expression, medical images or metabolites. Following these advances, this thesis develops three multi-trait mixed models. First, phenix imputes missing phenotype data by modifying probabilistic matrix factorization to incorporate genetic relatedness. Second, general linear mixed models (GLMMs) generalize and unify multi-VC, multi-trait and likelihood-penalized mixed models. Finally, compressive mixed models (CMMs) combine the two, obtaining the imputation and computational benefits of phenix and the heritability estimation and multi-VC capabilities of GLMMs. In imputation, phenix essentially always outperforms all competitors, and it can improve GWAS power. GLMMs accurately estimate heritability despite (measured) confounders, can improve phenotype prediction, and increase gene-based, multi-trait association signal. CMMs regularly improve prediction, scale to thousands of phenotypes, and can uncover plausible GWAS hits entirely missed by LMMs.
Altogether, multi-trait mixed models are invaluable for intrinsically multi-trait tasks, like phenotype imputation and low-rank decomposition, and, surprisingly, can be much faster than LMMs; however, I find only small and inconsistent benefits for single-trait-oriented objectives like heritability estimation and out-of-sample prediction. The challenges in this thesis are primarily computational. Naively, multi-trait approaches model an N × P matrix of P phenotypes measured on N samples as a long Gaussian vector of length NP, inducing prohibitive O(N³P³) computations. Fortunately, the parsimonious matrix normals underlying mixed models enable simpler O(N³ + P³) expressions. This is summarized by a new decomposition for positive semidefinite tensor products that, under a crucial assumption, facilitates cheap evaluation of ubiquitous low-level operations like multiplication.
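The gap between O(N³P³) and O(N³ + P³) can be illustrated with a small Python sketch. It is not the decomposition developed in the thesis, only the textbook eigendecomposition identity for a Kronecker-structured covariance. The covariance form C ⊗ K + sigma2·I (a P × P trait covariance C, an N × N relatedness matrix K, and independent noise) and the function names kron_solve, naive_solve and random_psd are assumptions made for this illustration.

```python
import numpy as np

def kron_solve(A, B, Y, sigma2):
    """Solve (A kron B + sigma2 * I) vec(X) = vec(Y) for X (N x P).

    A is P x P, B is N x N, both symmetric PSD; vec stacks columns.
    Only the two small eigendecompositions are computed, so the
    NP x NP matrix is never formed.
    """
    sa, Ua = np.linalg.eigh(A)   # O(P^3)
    sb, Ub = np.linalg.eigh(B)   # O(N^3)
    # Rotate into the joint eigenbasis: (Ua kron Ub)^T vec(Y) = vec(Ub^T Y Ua).
    Z = Ub.T @ Y @ Ua
    # The eigenvalue paired with entry (i, j) is sb[i] * sa[j] + sigma2.
    Z = Z / (np.outer(sb, sa) + sigma2)
    # Rotate back: (Ua kron Ub) vec(Z) = vec(Ub Z Ua^T).
    return Ub @ Z @ Ua.T

def naive_solve(A, B, Y, sigma2):
    """Same solve via the explicit NP x NP covariance, O(N^3 P^3)."""
    N, P = Y.shape
    Sigma = np.kron(A, B) + sigma2 * np.eye(N * P)
    x = np.linalg.solve(Sigma, Y.flatten(order="F"))
    return x.reshape((N, P), order="F")

def random_psd(n, rng):
    """Hypothetical helper: a random symmetric PSD matrix for the demo."""
    M = rng.standard_normal((n, n))
    return M @ M.T / n

rng = np.random.default_rng(0)
N, P = 30, 4                      # toy sizes so the naive solve stays cheap
K = random_psd(N, rng)            # stand-in for a relatedness matrix
C = random_psd(P, rng)            # stand-in for a trait covariance
Y = rng.standard_normal((N, P))   # stand-in for a phenotype matrix

fast = kron_solve(C, K, Y, 0.5)
slow = naive_solve(C, K, Y, 0.5)
max_abs_diff = float(np.max(np.abs(fast - slow)))
```

Both routines return the same N × P solution up to floating-point error, but kron_solve only ever factorizes an N × N and a P × P matrix. The independent-noise assumption is what lets a single pair of eigenbases diagonalize the whole covariance here; more general noise structures need additional steps that this sketch does not cover.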
Supervisor: Marchini, Jonathan
Sponsor: Wellcome Trust
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID: uk.bl.ethos.730328
DOI: Not available