Title: A bioinformatic analysis of Mycobacterium tuberculosis and host genomic data
Author: Phelan, J.
ISNI: 0000 0004 6497 2757
Awarding Body: London School of Hygiene & Tropical Medicine
Current Institution: London School of Hygiene and Tropical Medicine (University of London)
Date of Award: 2018
Human tuberculosis disease (TB) is caused by bacteria within the Mycobacterium tuberculosis complex, including M. tuberculosis (Mtb). Genetic variation within the pathogen can lead to drug resistance and affect virulence and transmissibility. I have analysed Mtb whole genome sequence data to improve the understanding of global genetic variation, and the resulting insights could ultimately assist the development of TB control measures. Whole genome sequencing platforms are being used to infer drug resistance profiles, and could thereby assist clinical management. I investigated the reproducibility of sequence data from two platforms (Illumina MiSeq, Ion Torrent PGM™) and two rapid analytic pipelines (TBProfiler, Mykrobe predictor). DNA replicates from the reference strain (H37Rv) and 10 drug-resistant strains were sequenced, and inferred drug resistance genotypes were compared to drug susceptibility testing phenotypes. Genome-wide association studies (GWAS) can be used to detect mutations associated with Mtb drug resistance. A first GWAS (n=127) attempted to identify mutations associated with minimum inhibitory concentrations for first-line anti-tuberculosis drugs. A second GWAS was applied to a large global set (n > 6400) to identify mutations associated with first- and second-line drug resistance. M. aurum is an environmental mycobacterium that has been proposed as a model for the development of anti-TB drugs. I have assembled and annotated its draft genome, and identified copy number variants in known drug resistance targets. Approximately 10% of the Mtb genome consists of two gene families (pe/ppe) that are poorly characterised, and are hypothesised to be important virulence factors. Using a de novo assembly approach, I characterised these genes and their diversity across a global collection of clinical isolates with high depth short-read sequence data (n=518). A follow-up study using a long-read sequence technology (n=18, diverse strain types) confirmed the findings.
This work also generated new annotated reference genomes and characterised methylation sites, which may affect transmissibility, pathogenicity and virulence. A future direction of the TB genomics field is to identify genetic checkpoints in host-pathogen interactions using both human and Mtb genotypes. I analysed the genomes of ~720 TB case–Mtb pairs and identified susceptibility markers, which are promising targets for future control measures.
Supervisor: Clark, T. G. ; Hibberd, M. I. ; Bhakta, S.
Sponsor: Biotechnology and Biological Sciences Research Council
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral