An engineering approach to knowledge acquisition by the interactive analysis of dictionary definitions
It has long been recognised that everyday dictionaries are a potential source of the lexical and world knowledge required by many Natural Language Processing (NLP) systems. This research presents a semi-automated approach to the extraction of rich semantic relationships from dictionary definitions. The definitions are taken from the recently published "Cambridge International Dictionary of English" (CIDE), and the thesis illustrates how many of CIDE's innovative features can be exploited during the knowledge acquisition process. The approach uses the LOLITA NLP system to extract and represent semantic relationships, along with a human operator to resolve the different forms of ambiguity that exist within dictionary definitions. This strategy combines the strengths of both participants in the acquisition process: automated procedures provide consistency in the construction of complex and inter-related semantic relationships, while the human participant can use his or her knowledge to determine the correct interpretation of a definition. The semi-automated strategy eliminates the weakness of many existing approaches because it guarantees feasibility and correctness: feasibility is ensured by exploiting LOLITA's existing NLP capabilities, so that humans with minimal linguistic training can resolve the ambiguities within dictionary definitions; correctness is ensured because incorrectly interpreted definitions can be manually eliminated. The feasibility and correctness of the solution are supported by the results of an evaluation, which is presented in detail in the thesis.