Title: Statistical approaches for copy number variation detection and association with complex human phenotypes
Author: De, Tisham
ISNI: 0000 0004 6348 0806
Awarding Body: Imperial College London
Current Institution: Imperial College London
Date of Award: 2014
Copy number variants (CNVs) play an important role in the pathogenesis of many diseases, including epilepsy and diabetes. CNVs are also known to affect cellular phenotypes through several phenomena, such as gene dosage. Next-generation technologies for sequencing (DNA and RNA) and metabolite profiling (metabolomics) have led to the systematic discovery and evaluation of various genomic variants and their relationship to multiple phenotypes. Such approaches often involve applying several statistical and machine learning methods to unravel new relationships between genomic variants and phenotypes, i.e. disease outcomes or quantitative traits characterized at the molecular level. This thesis explores and develops several statistical methods for CNV detection and association with complex human phenotypes, in particular epilepsy drug response, epilepsy susceptibility, metabolomics and gene expression. In more detail, chapter 3 describes a genome-wide CNV association analysis for two phenotypes: epilepsy susceptibility and epilepsy drug response. I identified several important candidate genes for these two phenotypes, including the most strongly associated genes, SLC9A1 (p-value=6.69E-15) for epilepsy susceptibility and WWOX (p-value=1.93E-3) for epilepsy drug response. These associations were replicated in a separate Australian cohort and were further examined in the laboratory and in silico, with some findings confirmed and others not. In chapter 4, I present the association of CNVs in the exonic regions of the TSPAN8 gene with metabolomic data. A strong association signal was detected in the 6th and 7th exons of the TSPAN8 gene, where a large proportion of metabonomic lipid phenotypes were found to be associated using univariate (p-value=7.64E-4) and multivariate (p-value=1.33E-6) approaches. These CNVs were also found to be nominally associated with type 2 diabetes (p-value=3.32E-7).
I also carried out advanced multivariate association analyses to corroborate these results, and I report sequencing-based validation results for TSPAN8 exonic CNVs in different human populations from the 1000 Genomes Project. In chapter 5, I report a genome-wide CNV association analysis with gene expression in ten different regions of the human brain. I identified a novel CNV near the DRD5 gene that was strongly associated with gene expression, and I report ongoing efforts to replicate and validate this finding. Each of the phenotype categories analysed posed its own challenges and required specific approaches for analysis and interpretation.
Supervisor: Coin, Lachlan; Prokopenko, Inga
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral