Title: An investigation into the cross-linguistic robustness of textual equivalence techniques
Author: Alshahrani, Amal
ISNI:       0000 0004 7971 1170
Awarding Body: University of Manchester
Current Institution: University of Manchester
Date of Award: 2018
This thesis explores a range of techniques that have been applied to the task of Textual Equivalence (TEQV), i.e., identifying whether one text snippet is equivalent to another. This task has been widely explored for English texts. In this study we investigate and analyse the extent to which these techniques generalise to other languages, in particular Arabic. Written Arabic is widely said to be more ambiguous than English, and this ambiguity makes determining the relationships between text snippets particularly challenging. We have applied the techniques in settings that are as similar as possible, so that any differences in the experimental results can be reliably attributed to differences between the two languages rather than to differences in the experimental set-up. In particular, the dynamic time warping (DTW) algorithm is used to measure the similarity between sentence pairs by calculating the minimum number of edit operations (Insert, Delete, Exchange) required to convert one sentence into the other, with WordNet similarity measures serving as the cost function for the Exchange operation. The algorithm is extended with an extra operation, Swap, which allows for local permutations to compensate for the comparatively free word order of Arabic. The results show that, once the coverage of Arabic WordNet is extended, TEQV results for Arabic are similar to those obtained with English WordNet for English, and that the extended version of DTW brings more benefit for Arabic than for English.
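The alignment described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: it uses the standard word-level edit-distance recurrence with an adjacent-transposition step standing in for Swap, and the function names, unit costs, and the placeholder `exchange_cost` (used here in place of a WordNet similarity measure) are all assumptions made for illustration.

```python
from typing import Callable, Sequence


def exchange_cost(a: str, b: str) -> float:
    """Placeholder Exchange cost (assumption): 0 for identical words, 1 otherwise.

    The abstract uses WordNet similarity measures here; this stand-in
    only keeps the example self-contained.
    """
    return 0.0 if a == b else 1.0


def alignment_cost(
    source: Sequence[str],
    target: Sequence[str],
    exchange: Callable[[str, str], float] = exchange_cost,
    insert_cost: float = 1.0,  # assumed unit cost
    delete_cost: float = 1.0,  # assumed unit cost
    swap_cost: float = 1.0,    # assumed unit cost
) -> float:
    """Minimum total cost of turning `source` into `target` using
    Insert, Delete, Exchange, and Swap of two adjacent words."""
    n, m = len(source), len(target)
    # d[i][j] = cheapest way to convert source[:i] into target[:j]
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = i * delete_cost
    for j in range(1, m + 1):
        d[0][j] = j * insert_cost

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best = min(
                d[i - 1][j] + delete_cost,
                d[i][j - 1] + insert_cost,
                d[i - 1][j - 1] + exchange(source[i - 1], target[j - 1]),
            )
            # Swap: the last two words of each prefix are the same pair
            # in opposite order (a simplified local permutation).
            if (
                i > 1
                and j > 1
                and source[i - 1] == target[j - 2]
                and source[i - 2] == target[j - 1]
            ):
                best = min(best, d[i - 2][j - 2] + swap_cost)
            d[i][j] = best
    return d[n][m]
```

With these assumed unit costs, comparing "the cat sat" with "cat the sat" costs one Swap, whereas the same pair costs two operations if Swap is made prohibitively expensive. Passing a graded `exchange` function lets related words count as cheaper substitutions than unrelated ones, which is the role the abstract assigns to the WordNet measures.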
Supervisor: Ramsay, Allan
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available
Keywords: Arabic ; paraphrasing ; Textual equivalence