Title: Discourse cohesion in Chinese-English statistical machine translation
Author: Steele, David
ISNI: 0000 0004 8504 743X
Awarding Body: University of Sheffield
Current Institution: University of Sheffield
Date of Award: 2019
In discourse, cohesion is a required component of meaningful and well-organised text. It establishes the relationships between different elements in the text using devices such as pronouns, determiners, and conjunctions. A well-translated document displays the correct cohesion and uses the cohesive devices that are pertinent to the language. However, not all languages have the same cohesive devices or use them in the same way. In statistical machine translation this is a particular barrier to generating smooth translations, especially when sentences in parallel corpora are treated in isolation and no extra meaning or cohesive context is provided beyond the sentence level. In this thesis, focussing on Chinese and English as the language pair, we examine discourse cohesion in statistical machine translation, looking at ways in which systems can leverage discourse cues and signals to produce smoother translations. We also provide a statistical model that improves translation output by inserting additional tokens into the text, which can be used to leverage extra information. A significant part of this research involved visualising many of the results and system outputs, so an overview of two important pieces of visualisation software that we developed is also included.
Supervisor: Specia, Lucia
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available