Title: Bioinformatic analysis of human Next Generation Sequencing data : extracting additional information, optimising mapping and variant calling, and application in a rare disease
Author: Sood, Roshan Kumar
ISNI: 0000 0004 7972 1299
Awarding Body: University of Southampton
Current Institution: University of Southampton
Date of Award: 2019
Availability of Full Text:
Access from EThOS:
Full text unavailable from EThOS.
With the increased application of Next Generation Sequencing (NGS) to medicine, it is important to test and develop approaches that extract the optimum information from datasets. In this thesis, five aspects of NGS are investigated, ranging from quality control to variant calling. Firstly, a method to estimate contamination from a VCF file was developed, which would be useful in cases where no BAM file is available for use with existing tools. Unmapped reads were then investigated as a source of additional information from NGS samples. This analysis was able to detect the abundance of oral microbes in saliva-derived samples relative to blood-derived samples, but failed to identify differences between inflammatory bowel disease patients and controls. For a familial trio with a reported rare case of Sedaghatian-type spondylometaphyseal dysplasia (SSMD), sequenced by both whole exome sequencing (WES) and whole genome sequencing (WGS), it was shown that nearly all coding variants from WES were also called in WGS, despite differences in mean depth of coverage. This comparison highlighted that, as sequencing costs decrease, WGS will offer the greatest diagnostic value, with potential for future re-analysis of cases that currently cannot be resolved. Using the familial trio, attempts were made to identify causal variant(s) in the gene currently implicated in causing SSMD, glutathione peroxidase 4 (GPX4). However, no variants, whether small SNPs or large structural variants, were identified across the GPX4 gene, and no plausible candidates were identified from the trio. Finally, variant calling of the FCGR low-affinity locus was performed using targeted NGS. FCGR genes have been highly duplicated, so customised references were used to infer the combinations of alleles across homologous sites. Using this approach, it was possible to predict SNPs in the FCGR3B gene and to predict human neutrophil antigen haplotypes involved in the immune response to treatments such as monoclonal antibodies.
Supervisor: Gibson, Jane
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available