Title: Improved genomic assembly and genomic analyses of Entamoeba histolytica
Author: Leckenby, A. E.
ISNI: 0000 0004 7964 1408
Awarding Body: University of Liverpool
Current Institution: University of Liverpool
Date of Award: 2018
Amoebiasis is the third most common cause of mortality worldwide from a parasitic infection. It affects up to 50 million people annually, of whom 100,000 die from the disease. Amoebiasis is caused by the amoeba Entamoeba histolytica, an obligate parasite of humans. Our understanding of the biology of this pathogen has been greatly advanced by the sequencing of its genome. However, the unusual nature of the genome (an extreme nucleotide composition bias, abundant repetitive elements and unknown chromosome structures/ploidy) made it particularly challenging to sequence, and the resulting reference genome assembly is highly fragmented and possibly incomplete, limiting its usefulness for some analyses. New sequencing technologies can overcome some of the problems of the previous genome assembly. Here, single-molecule real-time (SMRT) sequencing was applied to sequence long fragments of DNA and build an improved reference genome for E. histolytica. This thesis describes the generation of sequence data and a comprehensive comparative analysis of genome assembly tools available for long-read SMRT sequencing data. This analysis showed that assembly using only PacBio data produced better-quality genome assemblies than hybrid assembly approaches utilising both long- and short-read data. The PacBio genome assembly is significantly better than the published reference genome assembly based on a range of quality metrics. The new genome assembly was annotated, revealing an increase in gene number. The spatial organisation of key virulence gene families (AIG1, Ariel-1, BspA, cysteine proteases, Gal/GalNAc lectins and STIRP families) was analysed, revealing an association of virulence gene families with transposable elements. The new assembly allowed analyses of two key, unusual features of the E. histolytica genome: the long arrays of multiple tRNA genes and the multi-copy, extra-chromosomal molecules containing the ribosomal DNA.
Several lines of evidence were consistent with tRNA arrays capping chromosomes and acting as telomeres in Entamoeba. Variation among array units exists (relevant as they are used as population genetic markers), but the majority sequence was consistently retrieved when genotyping, suggesting they may be relatively robust markers. Analysis of the rDNA episomes present in the E. histolytica strain sequenced (the HM-1:IMSS strain used for previous whole-genome sequencing) revealed that one of the two rDNA episome types described in this strain has apparently been lost during in vitro culture. Genome-wide 5-methylcytosine methylation profiles for trophozoite-stage parasites in culture were determined using bisulphite sequencing for the new E. histolytica genome assembly and two additional species (Entamoeba moshkovskii and Entamoeba invadens). The analyses confirmed previous reports of sparse methylation of the genome as a whole but highlighted interesting patterns of methylation. While there was virtually no methylation of genes, there was extensive methylation of transposable elements and tRNA arrays. These patterns suggest that methylation functions to suppress active transposition and may play a role in the structural control of tRNA arrays, again consistent with a telomeric role. The work presented here improves our understanding of the structure, content and regulation of the E. histolytica genome and provides a platform for improved future analyses for the Entamoeba research community.
Supervisor: Hertz-Fowler, Christiane; Weedall, Gareth; Hall, Neil
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral