Use this URL to cite or link to this record in EThOS:
Title: Interpreting document collections with topic models
Author: Aletras, Nikolaos
ISNI:       0000 0004 5356 5590
Awarding Body: University of Sheffield
Current Institution: University of Sheffield
Date of Award: 2014
Availability of Full Text:
Access from EThOS:
Access from Institution:
This thesis concerns topic models, a set of statistical methods for interpreting the contents of document collections. These models automatically learn sets of topics from words frequently co-occurring in documents. Topics learned often represent abstract thematic subjects, i.e Sports or Politics. Topics are also associated with relevant documents. These characteristics make topic models a useful tool for organising large digital libraries. Hence, these methods have been used to develop browsing systems allowing users to navigate through and identify relevant information in document collections by providing users with sets of topics that contain relevant documents. First, we look at the problem of identifying incoherent topics. We show that our methods work better than previously proposed approaches. Next, we propose novel methods for efficiently identifying semantically related topics which can be used for topic recommendation. Finally, we look at the problem of alternative topic representations to topic keywords. We propose approaches that provide textual or image labels which assist in topic interpretability. We also compare different topic representations within a document browsing system.
Supervisor: Stevenson, Mark Sponsor: Not available
Qualification Name: Thesis (Ph.D.) Qualification Level: Doctoral
EThOS ID:  DOI: Not available