Title: Development of software framework for the integration of metagenomics with clinical and metadata
Author: Koci, Orges
ISNI: 0000 0004 9349 7908
Awarding Body: University of Glasgow
Current Institution: University of Glasgow
Date of Award: 2020
Availability of Full Text:
Access from EThOS:
Full text unavailable from EThOS. Thesis embargoed until 01 May 2023
The past few years have seen shotgun metagenomics increasingly used for microbial community surveys in place of traditional amplicon sequencing. This has been made possible by methodological advances that now enable us to assemble short sequence reads into longer contiguous regions, which can be binned together to identify the species they belong to (e.g., through the CONCOCT software), and whose coding regions can further be annotated against public databases to give an assessment of functional diversity. At the same time, integrated solutions are gaining importance through complementing meta’omics technologies. To consolidate all these realisations on the same sample space, and to fully delineate the response of microbial activity to environmental factors, it is necessary to include and integrate all levels of gene products (mRNA, proteins, metabolites), as well as their interactions, in a single platform. Hence, in this thesis, we explore a set of statistical analyses and introduce CViewer, a Java-based software tool that integrates output data from CONCOCT as well as from major third-party taxonomic and annotation software. The software provides a comprehensive set of multivariate statistical algorithms, built on the theoretical underpinnings of numerical ecology, to allow exploratory as well as hypothesis-driven analyses, emphasizing functional traits of microbial communities and phylogenetic-based approaches to community assembly, particularly abiotic filtering. The end result is a highly interactive toolkit with a multiple-document interface that makes it easier to unravel useful patterns through point-and-click tools, whether looking at annotated tracks of metagenomic contigs or exploring enrichments of metabolic pathways and microbial species.
As a proof of concept, we have used CViewer to explore two independent data sets: a longitudinal gut microbiome profile of children with Crohn’s disease, used to unravel its aetiology through a dietary intervention targeting the gut microbes; and a gut microbiome profile from an obesity data set comparing subjects who are naturally and/or pathologically obese against those who are lean. In addition to analysing the sequencing data, we have developed pyTag, a text-mining tool for investigating literature related to Inflammatory Bowel Diseases (IBDs), with the aim of supplementing genomic exploration with associated textual data available in public repositories such as PubMed. For example, if metagenomics data are available for studying obesity, pyTag can derive temporal profiles in terms of ontologies (dictionaries of specific terms related to environment, disease, chemical compounds, tissue information, etc.) for all the papers (PubMed abstracts) that were published and categorized under “obesity”. This provides additional context for data analysis. In this thesis, however, we have tested pyTag in the context of common IBDs, including Crohn’s disease, Ulcerative Colitis, Coeliac disease and Irritable Bowel Syndrome, to provide spatial and temporal trends.
Supervisor: Not available
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available
Keywords: QR Microbiology ; R Medicine (General) ; T Technology (General)