Title: Building tag hierarchies based on co-occurrences and lexico-syntactic patterns
Author: Bin Moqhim, Fahad Ibrahim
ISNI: 0000 0004 7224 8435
Awarding Body: University of Southampton
Current Institution: University of Southampton
Date of Award: 2016
Knowledge structures, such as taxonomies, are key to the organization and management of Web content, but are expensive to build manually. In this thesis we explore the issues around automatically building effective tag hierarchies from folksonomies (collective social classifications), and propose changes to the state-of-the-art methods that improve their performance. These changes aim to tackle the “generality-popularity” problem in tagging, in which popularity is assumed (sometimes inaccurately) to be a proxy for generality, i.e. high-level taxonomic terms are assumed to occur more often than low-level ones.
The effectiveness of this research is demonstrated in four experiments. The first experiment explores whether taxonomic tag pairs captured directly from users change the quality of constructed tag hierarchies. The second experiment examines the possibility of using personal tag relationships constructed by users to improve the accuracy of learned taxonomic tags. The third experiment demonstrates the potential of using lexico-syntactic patterns applied to a closed text corpus to improve the direction of automatically derived tag pairs in order to build higher-quality tag hierarchies. The last experiment investigates the possibility of using an open knowledge repository instead of a closed knowledge resource to increase tag coverage in any tag collection, and consequently the quality of learned tag hierarchies.
The results of our experiments show, first, that collecting taxonomic tag pairs increases the semantic quality of the tag hierarchy, but at the expense of expressivity, and with some degradation of user experience. Second, personal tag relationships can be used to improve the accuracy of constructed taxonomic tags, but with limited success if the personal tag relationships and the learned taxonomic tags are not extracted from the same tagging system. Finally, lexico-syntactic patterns applied to a large closed text corpus (e.g. Wikipedia) can be used to improve the accuracy of directions in relations constructed between tags by a generality-based approach to tag hierarchy construction, and this would be improved further if an open corpus (e.g. the Web) were used instead of a closed one, which consequently improves the quality of the learned tag hierarchies in terms of structure and semantics.
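The idea of using lexico-syntactic patterns to correct the direction of a tag pair can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the three Hearst-style patterns, the function names `pattern_votes` and `orient`, and the toy sentences are all assumptions made for illustration. The `default` argument stands in for the direction proposed by a generality-based (popularity-driven) method, which is kept when the corpus offers no evidence either way.

```python
import re
from collections import Counter


def pattern_votes(tag_a, tag_b, sentences):
    """Count pattern matches supporting each (general, specific) ordering.

    The patterns below are illustrative assumptions, not the thesis's set.
    """
    tag_a, tag_b = tag_a.lower(), tag_b.lower()
    votes = Counter()
    for general, specific in ((tag_a, tag_b), (tag_b, tag_a)):
        g, s = re.escape(general), re.escape(specific)
        patterns = [
            rf"\b{g}s?,? such as (?:\w+, )*{s}\b",   # "languages such as python"
            rf"\b{s}s?,? (?:and|or) other {g}s?\b",  # "python and other languages"
            rf"\b{s} is an? {g}\b",                  # "python is a language"
        ]
        for sentence in sentences:
            text = sentence.lower()
            # Each pattern contributes at most one vote per sentence.
            votes[(general, specific)] += sum(
                bool(re.search(p, text)) for p in patterns
            )
    return votes


def orient(tag_a, tag_b, sentences, default):
    """Return (general, specific), falling back to `default` on a tie."""
    tag_a, tag_b = tag_a.lower(), tag_b.lower()
    votes = pattern_votes(tag_a, tag_b, sentences)
    forward, backward = votes[(tag_a, tag_b)], votes[(tag_b, tag_a)]
    if forward > backward:
        return (tag_a, tag_b)
    if backward > forward:
        return (tag_b, tag_a)
    return default


# Toy corpus (hypothetical); a real run would use a large text collection.
corpus = [
    "Programming languages such as Python are popular.",
    "Python is a language used widely.",
]
# Suppose popularity wrongly proposes python -> language; the patterns flip it.
print(orient("python", "language", corpus, default=("python", "language")))
```

In this sketch the popularity-based guess only survives when the patterns are silent or tied, which mirrors the abstract's point that pattern evidence is used to improve direction rather than to discover the pairs themselves.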
Supervisor: Millard, David
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available