Title: Example-based methods for natural language processing with applications to machine translation and preposition correction
Author: Smith, James Sullivan
Awarding Body: University of Oxford
Current Institution: University of Oxford
Date of Award: 2012
Availability of Full Text:
Full text unavailable from EThOS.
Please contact the current institution’s library for further details.
We investigate the use of example-based methods for Natural Language Processing tasks. Specifically, we look at machine translation and preposition prediction.

We propose a new framework for the hybridisation of Example-Based and Statistical Machine Translation (EBMT and SMT) systems. We add powerful new functionality to the Moses SMT system to allow it to work effectively with our EBMT system. Within this framework, we investigate the use of two types of EBMT system. We first create an EBMT system which uses string-based matching and evaluate it within the hybrid framework. We investigate several variations, but find that the hybrid system is unable to match the performance of the pure SMT system. We next create a syntax-based EBMT system which uses dependency trees to compare inputs to the example base, and show that this system is consistently better than the string-based approach. We find that while the SMT system still performs better overall, the syntax-based hybrid performs particularly well for some examples.

We then look at the application of example-based methods to preposition prediction for non-native English writers. We create two systems, one syntax-based and the other string-based. The syntax-based system again uses dependency information to make predictions, and we show that it achieves very high precision but low recall. The string-based system uses n-gram counts to make preposition predictions. We show that this approach is simple and fast, and performs as well as or better than other leading systems in the field.

We conclude that example-based techniques continue to yield impressive results for NLP tasks, and expect the field to benefit further as computing and data resources develop.
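As an illustration only, the general idea of predicting a preposition from n-gram counts can be sketched as below. This is not the system described in the thesis: the candidate list, the trigram window, the toy corpus, and the names `count_ngrams` and `predict_preposition` are all assumptions made for this sketch.

```python
from collections import Counter

# Assumed candidate set; the thesis's actual preposition inventory is not given here.
PREPOSITIONS = ("in", "on", "at", "of", "to", "for", "with", "by")


def count_ngrams(sentences, n=3):
    """Count every n-gram (tuple of lowercased tokens) in a list of sentences."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def predict_preposition(left, right, counts, candidates=PREPOSITIONS):
    """Pick the candidate whose trigram (left, prep, right) is most frequent.

    Returns None when no candidate trigram was seen in the corpus.
    """
    scored = [(counts[(left.lower(), prep, right.lower())], prep)
              for prep in candidates]
    best_count, best_prep = max(scored)
    return best_prep if best_count > 0 else None


if __name__ == "__main__":
    # Toy corpus standing in for a large collection of native English text.
    corpus = [
        "she arrived at the station",
        "he waited at the station",
        "they live in the city",
    ]
    counts = count_ngrams(corpus)
    print(predict_preposition("arrived", "the", counts))
    print(predict_preposition("live", "the", counts))
```

A practical system would presumably draw counts from a much larger corpus and combine several context windows; the sketch uses a single trigram window only to keep the idea visible.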
Supervisor: Clark, Stephen
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available