Title: Statistical methods for elucidating copy number variation in high-throughput sequencing studies
Author: Bellos, Evangelos
ISNI: 0000 0004 5349 6126
Awarding Body: Imperial College London
Current Institution: Imperial College London
Date of Award: 2014
Availability of Full Text: Full text unavailable from EThOS.
Copy number variation (CNV) is pervasive in the human genome and has been shown to contribute significantly to phenotypic diversity and disease aetiology. High-throughput sequencing (HTS) technologies have allowed for the systematic investigation of CNV at an unprecedented resolution. HTS studies offer multiple distinct features that can provide evidence for the presence of CNV. We have developed an integrative statistical framework that jointly analyses multiple sequencing features at the population level to achieve sensitive and precise discovery of CNV. First, we applied our framework to low-coverage whole-genome sequencing experiments and used data from the 1000 Genomes Project to demonstrate a substantial improvement in CNV detection accuracy over existing methods. Next, we extended our approach to targeted HTS experiments, which offer improved cost-efficiency by focusing on a predetermined subset of the genome. Targeted HTS involves an enrichment step that introduces non-uniformity in sequencing coverage across target regions and thus hinders CNV identification. To address this, we designed a customized normalization procedure that counteracts the effects of enrichment bias and enhances the underlying CNV signal. Our extended framework was benchmarked on contiguous capture datasets, where it was shown to outperform competing strategies by a wide margin. Capture sequencing can also generate large amounts of data in untargeted genomic regions. Although these off-target results can be a valuable source of CNV evidence, they are subject to complex enrichment patterns that confound their interpretation. Therefore, we developed the first normalization strategy that can adapt to the highly heterogeneous nature of off-target capture and thus facilitate CNV investigation in untargeted regions.
All in all, we present a generalized CNV detection toolset that has been shown to achieve robust performance across datasets and sequencing platforms and can therefore provide valuable insight into the prevalence and impact of CNV.
Supervisor: Coin, Lachlan; Johnson, Michael
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available