Use this URL to cite or link to this record in EThOS: http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.664581
Title: The application of constraint rules to data-driven parsing
Author: Jaf, Sardar
ISNI: 0000 0004 5364 3296
Awarding Body: University of Manchester
Current Institution: University of Manchester
Date of Award: 2015
Abstract:
The process of determining the structural relationships between words in both natural and machine languages is known as parsing. Parsers are used as core components in a number of Natural Language Processing (NLP) applications such as online tutoring applications, dialogue-based systems and textual entailment systems. They have also been used widely in the development of machine languages. In order to understand the way parsers work, we will investigate and describe a number of widely used parsing algorithms. These algorithms have been utilised in a range of different contexts such as dependency frameworks and phrase structure frameworks. We will investigate and describe some of the fundamental aspects of each of these frameworks, which can function in various ways including grammar-driven approaches and data-driven approaches. Grammar-driven approaches use a set of grammatical rules for determining the syntactic structures of sentences during parsing. Data-driven approaches use a set of parsed data to generate a parse model which is used for guiding the parser during the processing of new sentences. A number of state-of-the-art parsers have been developed that use such frameworks and approaches. We will briefly highlight some of these in this thesis. Three features are particularly important in the development of parsers: efficiency, accuracy, and robustness. Efficiency is concerned with the use of as little time and computing resources as possible when processing natural language text. Accuracy involves maximising the correctness of the analyses that a parser produces. Robustness is a measure of a parser’s ability to cope with grammatically complex sentences and produce analyses of a large proportion of a set of sentences. In this thesis, we present a parser that can efficiently, accurately, and robustly parse a set of natural language sentences.
Additionally, the implementation of the parser presented here allows for some trading-off between different levels of parsing performance. For example, some NLP applications may emphasise efficiency/robustness over accuracy while some other NLP systems may require a greater focus on accuracy. In dialogue-based systems, it may be preferable to produce a correct grammatical analysis of a question, rather than quickly producing an incorrect analysis of its structure or a grammatically incorrect answer. Alternatively, it may be desirable that document translation systems translate a document into a different language quickly but less accurately, rather than slowly but highly accurately, because users may be able to correct grammatically incorrect sentences manually if necessary. The parser presented here is based on data-driven approaches, but it allows constraint rules to be applied to it in order to improve its performance.
Supervisor: Not available
Sponsor: Qatar National Research Fund
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID: uk.bl.ethos.664581
DOI: Not available
Keywords: parsing ; natural language parsing ; dependency parsing ; data-driven parsing ; constraint rules ; constraint rules for parsing ; grammar extraction ; syntactic parsing ; parsing Arabic ; non-projective parsing