Title: Integrating and querying semantic annotations
Author: Chen, Luying
ISNI: 0000 0004 5349 4227
Awarding Body: University of Oxford
Current Institution: University of Oxford
Date of Award: 2014
Availability of Full Text:
Access from EThOS:
Full text unavailable from EThOS. Restricted access.
Semantic annotations are crucial components in turning unstructured text into more meaningful and machine-understandable information. Acquiring large volumes of semantically enriched information would benefit the applications that consume it. At present there is a plethora of commercial and open-source services and tools for enriching documents with semantic annotations. Since there has been limited effort to compare such annotators, this study first surveys and compares them along multiple dimensions, including their techniques, coverage and annotation quality. The overlap and diversity in the capabilities of annotators motivate the need for semantic annotation integration: middleware that produces a unified annotation of improved quality on top of diverse semantic annotators. The integration of semantic annotations raises new challenges, compared both with usual data integration scenarios and with standard aggregation of machine learning tools. A set of approaches to these challenges is proposed; they perform ontology-aware aggregation by adapting Maximum Entropy Markov models to the setting of ontology-based annotations. These approaches are compared with existing ontology-unaware supervised approaches, ontology-aware unsupervised methods and individual annotators, and show an overall improvement in all the testing scenarios. A middleware system, ROSeAnn, and its corresponding APIs have been developed. This study also concerns the availability and usability of semantically rich data. The second focus of this thesis is therefore to allow users to query text annotated by different annotators using both explicit and implicit knowledge. We describe our first step towards this: a query language and a prototype system, QUASAR, that provides a uniform way to query multiple facets of annotated documents.
We will show how integrating semantic annotations and utilizing external knowledge help in increasing the quality of query answers over annotated documents.
Supervisor: Benedikt, Michael
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available
Keywords: Computing; Applications and algorithms; Information extraction; Machine learning