Title: Weakly supervised sentiment analysis and opinion extraction
Author: Angelidis, Stefanos
ISNI: 0000 0004 7969 354X
Awarding Body: University of Edinburgh
Current Institution: University of Edinburgh
Date of Award: 2019
Availability of Full Text: Full text unavailable from EThOS.
In recent years, online reviews have become the foremost medium for users to express their satisfaction, or lack thereof, about products and services. The proliferation of user-generated reviews, combined with the rapid growth of e-commerce, results in vast amounts of opinionated text becoming available to consumers, manufacturers, and researchers alike. This has fuelled an increased focus on automated methods that attempt to discover, analyze, and distill opinions found in text. This thesis tackles the tasks of fine-grained sentiment analysis and aspect extraction, and presents a unified framework for the summarization of opinions from multiple user reviews.
Two core concepts form the basis of our methodology. Firstly, the use of neural networks, whose ability to learn continuous feature representations from data, without recourse to preprocessing tools or linguistic annotations, has advanced the state of the art in numerous Natural Language Processing tasks. Secondly, our belief that opinion mining systems applied to real-life applications cannot rely on expensive human annotations and should mostly take advantage of freely available review data.
Specifically, the main contributions of this thesis are:
(i) The creation of OPOSUM, a new Opinion Summarization corpus which contains over one million reviews from multiple domains. To test our methods, we annotated a subset of the data with fine-grained sentiment and aspect labels, as well as extractive gold-standard opinion summaries.
(ii) The development of two weakly-supervised hierarchical neural models for the detection and extraction of sentiment-heavy expressions in reviews. Our first model composes segment representations hierarchically and uses an attention mechanism to differentiate between opinions and neutral statements. Our second model is based on Multiple Instance Learning (MIL), and can detect user opinions of potentially opposing polarity. Experiments demonstrate significant benefits from our MIL-based architecture.
(iii) The introduction of a neural model for aspect extraction, which requires minimal human involvement. Our proposed formulation uses aspect keywords to help the model target specific aspects, and a multi-tasking objective to further improve its accuracy.
(iv) A unified summarization framework which combines our sentiment and aspect detection methods, while taking redundancy into account to produce useful opinion summaries from multiple reviews. Automatic evaluation, on our opinion summarization dataset, shows significant improvements over other summarization systems in terms of extraction accuracy and similarity to reference summaries. A large-scale judgement elicitation study indicates that our summaries are also preferred by human judges.
Supervisor: Lapata, Maria ; Sutton, Charles
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available
Keywords: natural language processing ; sentiment analysis ; summarization ; opinion extraction ; text analysis ; neural networks ; machine learning