Use this URL to cite or link to this record in EThOS: http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.632347
Title: Semantic similarity framework for Thai conversational agents
Author: Osathanunkul, Khukrit
ISNI: 0000 0004 5360 6057
Awarding Body: Manchester Metropolitan University
Current Institution: Manchester Metropolitan University
Date of Award: 2014
Abstract:
Conversational Agents integrate computational linguistics techniques and natural language to support human-like communication with complex computer systems. They have a number of applications in business, education and entertainment, for example as unmanned call centres or as personal shopping or navigation assistants. Initial research has been performed on Conversational Agents in languages other than English, but there has been no significant publication on Thai Conversational Agents. Moreover, no research has been conducted on supporting algorithms for Thai word similarity measures and Thai sentence similarity measures. Consequently, this thesis details the development of a novel Thai sentence semantic similarity measure that can be used to create a Thai Conversational Agent. This measure, the Thai Sentence Semantic Similarity measure (TSTS), is inspired by the seminal English measure, Sentence Similarity based on Semantic Nets and Corpus Statistics (STASIS). A Thai sentence benchmark dataset, called the 65 Thai Sentence pairs benchmark dataset (TSS-65), is also presented in this thesis for the evaluation of TSTS. The research starts with the development of a simple Thai word similarity measure called the Thai Word Semantic Similarity measure (TWSS). Additionally, a novel word measure called the Semantic Similarity Measure based on a Lexical Chain Created from a Search Engine (LCSS) is proposed, which uses a search engine to create the knowledge base instead of WordNet. LCSS overcomes the problem that the prototype version of TWSS has with word pairs that are related to Thai culture. Two Thai word benchmark datasets, the 30 Thai Word Pair benchmark dataset (TWS-30) and the 65 Thai Word Pair benchmark dataset (TWS-65), are also presented for the evaluation of TWSS and LCSS, respectively. The result of TSTS is considered a starting point for a Thai sentence measure which can be used to create semantic-based Conversational Agents in future. This is illustrated using a small sample of real English Conversational Agent human dialogue utterances translated into Thai.
Supervisor: Not available Sponsor: Not available
Qualification Name: Thesis (Ph.D.) Qualification Level: Doctoral
EThOS ID: uk.bl.ethos.632347  DOI: Not available