Use this URL to cite or link to this record in EThOS: http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.487602
Title: Tagging and parsing Icelandic text
Author: Loftsson, Hrafn
Awarding Body: University of Sheffield
Current Institution: University of Sheffield
Date of Award: 2007
Abstract:
Natural language processing (NLP) is a very young discipline in Iceland. Therefore, there is a lack of publicly available basic tools for processing the morphologically complex Icelandic language. In this thesis, we investigate the effectiveness and viability of using (mainly) rule-based methods for analysing the syntax of Icelandic text. For this purpose, and because our work has a practical focus, we develop an NLP toolkit, IceNLP. The toolkit consists of a tokeniser, the morphological analyser IceMorphy, the part-of-speech tagger IceTagger, and the shallow parser IceParser. The task of the tokeniser is to split a sequence of characters into linguistic units and identify where one sentence ends and another one begins. IceMorphy is used for guessing part-of-speech tags for unknown words and filling in tag profile gaps in a dictionary. IceTagger is a linguistic rule-based tagger which achieves considerably higher tagging accuracy than previously reported results using taggers based on data-driven techniques. Furthermore, by using several tagger integration and combination methods, we increase substantially the tagging accuracy of Icelandic text, with regard to previous work. Our shallow parser, IceParser, is an incremental finite-state parser, the first parser published for the Icelandic language. It produces shallow syntactic annotation, using an annotation scheme specifically developed in this work. Furthermore, we create a grammar definition corpus, a representative collection of sentences annotated using the annotation scheme. The development of our toolkit is a step towards the goal of building a Basic Language Resource Kit (BLARK) for the Icelandic language. Our toolkit has been made available for use in the research community, and should therefore encourage further research and development of NLP tools.
Supervisor: Not available
Sponsor: Not available
Qualification Name: University of Sheffield, The Department of Computer Science, 2007
Qualification Level: Doctoral
EThOS ID: uk.bl.ethos.487602
DOI: Not available