Analysis of host and herpesvirus interactions using bioinformatics
Bioinformatics methods have become central to analysing and organising the sequence data continually produced by new and existing sequencing projects. The field covers both the static aspects of organising and presenting these raw data, by compiling existing knowledge into accessible databases, ontologies, and libraries, and the more dynamic aspects of knowledge discovery: interpreting and mining existing data. The aim of this thesis is to use such methods to analyse the herpesvirus-host relationship. In Chapter 2, the sequences of all currently sequenced herpesvirus open reading frames are compared with the conceptually translated human genome, with the aim of identifying herpesvirus-human (host) sequence homologues. Collating all currently known host homologues in one search provides the first complete assessment of herpesvirus-host homologues. The search recovered 55 previously reported herpesvirus-host homologues and identified 4 that were previously unknown.
The work performed in Chapter 2 highlighted the need for consistent annotation of genomes and gene products to support broader comparative genomic analysis. It is not feasible to manually curate large numbers of genes whose relationships to each other are not immediately clear. Chapters 3 and 4 therefore focus on the Gene Ontology, a publicly available resource for annotating gene products with a unified vocabulary whose terms are organised as a structured directed acyclic graph. To support host-pathogen interaction annotation, a) 187 new terms relating specifically to virus function and structure were added to the Gene Ontology (Chapter 3), and b) both the new and existing terms were used to annotate the entire Human Herpesvirus 1 genome, using references from the available literature (Chapter 4).
Finally, Chapter 5 examines the utility of the Gene Ontology when analysing the large-scale host and herpesvirus gene expression datasets produced by DNA microarray studies. Using such automated annotation methods, a cluster of 12 proteins was identified that increase mitochondrial function in HUVEC cells 24 hours after HCMV infection. A cluster of nine proteins that function in the MAPK pathway was also identified using the Gene Ontology, providing evidence for HCMV inhibition of the MAPK pathway.
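As an illustration of why a directed acyclic graph is useful for grouping gene products, the following minimal Python sketch propagates each annotation to all ancestor terms and then collects the genes that share a term. It is not the method used in the thesis: the ontology, the term names, and the gene names (`geneA`, `geneB`, `geneC`) are small invented placeholders rather than real Gene Ontology identifiers, and the helper functions `ancestors` and `group_by_term` are hypothetical.

```python
from collections import defaultdict

# Toy ontology: each term maps to its set of parent terms. Because this is a
# DAG rather than a tree, a term may have more than one parent.
# All term names are illustrative placeholders, not real Gene Ontology terms.
PARENTS = {
    "biological_process": set(),
    "metabolic_process": {"biological_process"},
    "signalling": {"biological_process"},
    "mitochondrial_function": {"metabolic_process"},
    "kinase_cascade": {"signalling", "metabolic_process"},
}

# Hypothetical direct annotations: gene -> terms assigned by a curator.
ANNOTATIONS = {
    "geneA": {"mitochondrial_function"},
    "geneB": {"kinase_cascade"},
    "geneC": {"mitochondrial_function"},
}


def ancestors(term, parents=PARENTS):
    """Return the term itself plus every term reachable via parent links."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def group_by_term(annotations, parents=PARENTS):
    """Map each term to the genes annotated to it directly or via a descendant."""
    groups = defaultdict(set)
    for gene, terms in annotations.items():
        for term in terms:
            for inherited in ancestors(term, parents):
                groups[inherited].add(gene)
    return dict(groups)


groups = group_by_term(ANNOTATIONS)
print(sorted(groups["mitochondrial_function"]))  # ['geneA', 'geneC']
print(sorted(groups["metabolic_process"]))       # ['geneA', 'geneB', 'geneC']
```

In this toy example, genes annotated to different specific terms are still grouped together under a shared, more general ancestor, which is the property that makes ontology-based annotation convenient for summarising large gene lists.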