Title: Investigating “Gene Ontology”-based semantic similarity in the context of functional genomics
Author: Welter, Danielle
ISNI:       0000 0004 2732 0493
Awarding Body: Cardiff University
Current Institution: Cardiff University
Date of Award: 2011
Gene functional annotations are an essential part of knowledge discovery in the analysis of large datasets, and the Gene Ontology (GO) [Ashburner et al., 2000] is the de facto standard for such annotations. Many approaches have been developed for quantifying the functional similarity between gene products from the semantic similarity between their annotations, but little guidance exists as to which of these measures is most appropriate for a given purpose. This thesis addresses that gap by comparing the performance of a number of similarity measures and their associated parameters. The comparison provided some new insights and confirmed emerging trends from the literature. There is also a pressing need for novel ways of applying these measures to facilitate the functional analysis of lists of gene products. We developed a novel algorithm, FuSiGroups, which groups GO terms based on their semantic similarity and genes based on their functional similarity. This two-fold grouping yields not only groups of functionally similar genes but also, for each group, an associated set of related GO terms that characterise a single functional aspect shared by the genes in the group; this makes the groups more coherent and easier to analyse. Each gene can belong to multiple groups, so the groups reflect the complexity of biological reality more accurately than clusters generated using traditional approaches. FuSiGroups was tested on a number of scenarios and in each case successfully generated biologically relevant groups, identifying the key functional aspects of the dataset. The algorithm also eliminated genes that were functionally unrelated to the bulk of the dataset and distinguished between different biological pathways. Although dataset size is currently a limiting factor, with the algorithm performing best on smaller datasets, FuSiGroups has been demonstrated to be a promising approach for the functional analysis of gene products.
Supervisor: Not available
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available