Title: A method for ontology and knowledge-base assisted text mining for diabetes discussion forum
Author: Issa, Ahmad
Awarding Body: University of Warwick
Current Institution: University of Warwick
Date of Award: 2015
Social media offers researchers a vast amount of unstructured text as a source for discovering hidden knowledge and insights. However, social media text poses new challenges to text mining and knowledge discovery because of its short length, temporal nature and informal language. To identify the main requirements for analysing unstructured text in social media, this research takes as a case study a large discussion forum in the diabetes domain. It then reviews and evaluates existing text mining methods against the requirements for analysing such a domain. Using domain background knowledge to bridge the semantic gap in traditional text mining methods was identified as a key requirement for analysing text in discussion forums. Existing ontology engineering methodologies encounter difficulties in deriving suitable domain knowledge with the appropriate breadth and depth of domain-specific concepts and a rich relationship structure. These limitations usually originate from a reliance on human domain experts. This research developed a novel semantic text mining method. It can identify the concepts and topics being discussed and the strength of the relationships between them, and then display the emergent knowledge from a discussion forum. The derived method has a modular design that consists of three main components: the Ontology Building Process, Semantic Annotation and Topic Identification, and Visualisation Tools. The ontology building process generates a domain ontology quickly with little need for domain experts. The topic identification component uses a hybrid system of a domain ontology and a general knowledge base for text enrichment and annotation, while the visualisation methods, dynamic tag clouds and co-occurrence networks for pattern discovery, enable flexible visualisation of these results and can help uncover hidden knowledge. Application of the derived text mining method within the case study helped identify trending topics in the forum and how they change over time.
The derived method performed better in semantic annotation of the text than the other systems evaluated. The new text mining method appears to be “generalisable” to domains other than diabetes. Future work needs to confirm this and to evaluate the method's applicability to other types of social media text sources.
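As a rough illustration of the annotation and co-occurrence steps the abstract describes, the sketch below tags forum posts with concepts from a small dictionary and counts how often pairs of concepts appear in the same post. It is not the thesis's implementation: the ontology entries, the example posts and the function names `annotate` and `cooccurrence` are all hypothetical, and plain dictionary lookup stands in for the hybrid ontology and knowledge-base annotation of the actual method.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical toy ontology: concept -> surface forms.
# These entries are illustrative only and are not taken from the thesis.
ONTOLOGY = {
    "Insulin": ["insulin", "basal insulin"],
    "Hypoglycaemia": ["hypo", "hypoglycaemia", "low blood sugar"],
    "Diet": ["diet", "carbs", "carbohydrates"],
}


def annotate(post):
    """Return the set of ontology concepts whose surface forms occur in a post."""
    text = post.lower()
    found = set()
    for concept, forms in ONTOLOGY.items():
        for form in forms:
            # Whole-word match so that "hypo" does not fire inside a longer word.
            if re.search(r"\b" + re.escape(form) + r"\b", text):
                found.add(concept)
                break
    return found


def cooccurrence(posts):
    """Count concept frequencies and concept-pair co-occurrences per post."""
    concept_counts = Counter()
    pair_counts = Counter()
    for post in posts:
        concepts = annotate(post)
        concept_counts.update(concepts)
        # Sort so each pair is always stored in the same order.
        pair_counts.update(combinations(sorted(concepts), 2))
    return concept_counts, pair_counts


# Made-up example posts, not data from the case study.
POSTS = [
    "Had a hypo last night after my insulin dose.",
    "Cutting carbs helped, but I still need insulin.",
    "Any diet tips for avoiding low blood sugar?",
]

concept_counts, pair_counts = cooccurrence(POSTS)
```

In such a sketch, the concept counts could feed a tag cloud (larger counts, larger tags) and the pair counts could serve as edge weights in a co-occurrence network; grouping posts by date before counting would give a crude view of how topics change over time.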
Supervisor: Not available
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available
Keywords: QA76 Electronic computers. Computer science. Computer software ; ZA Information resources