Title: Pushing the envelope of sentiment analysis beyond words and polarities
Author: Williams, Lowri
ISNI: 0000 0004 7229 3279
Awarding Body: Cardiff University
Current Institution: Cardiff University
Date of Award: 2017
Idioms are multi-word expressions that hold both a literal and a figurative meaning, the latter being conventionally understood by native speakers. Their overall meaning often cannot be deduced from the literal meaning of their constituent words. Sentiment analysis, also referred to as opinion mining, aims to automatically extract and classify sentiments, opinions, and emotions expressed in text. The research in this thesis is motivated by the fact that idioms, which often express an affective stance towards an entity or an event, are not featured systematically in sentiment analysis. To estimate the degree to which the inclusion of idioms as features may improve the results of traditional sentiment analysis, we compared our results to two state-of-the-art sentiment analysis approaches. Firstly, we collected a set of idioms that are relevant to sentiment analysis, i.e. those that can be mapped to an emotion. These mappings were obtained using a crowdsourcing approach. Secondly, to evaluate the results of sentiment analysis, we assembled a corpus of sentences in which idioms are used in context. Each sentence was annotated with an emotion, which formed the basis for the gold standard used for the comparison against the baseline methods. Classification performance improved by almost 20 percentage points.
Despite the positive findings from our initial experiments, the main limitation was the significant knowledge-engineering overhead involved in hand-crafting the lexico-semantic resources used to support idiom-based features. To minimise the bottleneck associated with the acquisition of such resources, we scaled up our original approach by automating their engineering. Subsequently, these resources were used to replace the manually engineered counterparts of such features in the originally proposed method. The fully automated approach outperformed the two baseline methods by 7 and 9 percentage points. These improvements, however, were smaller than those achieved in the initial study. Nevertheless, we have demonstrated not only that idiom-based features can be engineered automatically, but also that they improve sentiment classification results when such features are present.
Taking a long-term view of the research in this thesis, we want to address the limitations of state-of-the-art sentiment analysis approaches by focusing on a full range of emotions, rather than sentiment polarity. However, there is no consensus among researchers on a standardised framework for classifying emotions. Proposing such a framework would be a major contribution to the field of sentiment analysis, as it would stimulate its evolution into fully-fledged emotion classification and allow for systematic comparison of independent studies. With this goal in mind, we investigated the utility of different classification frameworks for sentiment analysis. A comprehensive statistical analysis of our experimental results provided explicit evidence that, in relative terms, six basic emotions are best suited for sentiment analysis. However, we identified a major shortcoming of this framework: it oversimplifies positive emotions.
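As a rough illustration of what an idiom-based feature might look like, the sketch below counts idiom matches in a sentence and groups them by polarity. It is not the method used in the thesis: the lexicon entries, the polarity labels, and the names IDIOM_POLARITY and idiom_features are all hypothetical, and simple phrase matching is assumed in place of whatever idiom recognition the thesis employs.

```python
import re
from collections import Counter

# Hypothetical idiom-to-polarity lexicon. The entries are illustrative only
# and are not taken from the resources described in the thesis.
IDIOM_POLARITY = {
    "over the moon": "positive",
    "on cloud nine": "positive",
    "down in the dumps": "negative",
    "at the end of my rope": "negative",
}


def idiom_features(sentence, lexicon=IDIOM_POLARITY):
    """Return counts of matched idioms per polarity for one sentence."""
    # Lower-case and collapse whitespace so matching is case-insensitive.
    text = re.sub(r"\s+", " ", sentence.lower())
    counts = Counter()
    for idiom, polarity in lexicon.items():
        # Match the idiom as a whole phrase, not inside longer words.
        pattern = r"\b" + re.escape(idiom) + r"\b"
        counts[polarity] += len(re.findall(pattern, text))
    return {
        "idiom_positive": counts["positive"],
        "idiom_negative": counts["negative"],
    }


if __name__ == "__main__":
    print(idiom_features("She was over the moon about the result."))
```

Counts of this kind could be appended to the feature vector of a conventional sentiment classifier. Exact phrase matching misses inflected or reordered idiom variants, so it should be read as a lower bound on what a real idiom recogniser would find.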
Supervisor: Not available
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available