Use this URL to cite or link to this record in EThOS:
Title: Incremental generative models for syntactic and semantic natural language processing
Author: Buys, Jan Moolman
ISNI:       0000 0004 6501 0920
Awarding Body: University of Oxford
Current Institution: University of Oxford
Date of Award: 2017
Availability of Full Text:
Access from EThOS:
Full text unavailable from EThOS. Please try the link below.
Access from Institution:
This thesis investigates the role of linguistically-motivated generative models of syntax and semantic structure in natural language processing (NLP). Syntactic well-formedness is crucial in language generation, but most statistical models do not account for the hierarchical structure of sentences. Many applications exhibiting natural language understanding rely on structured semantic representations to enable querying, inference and reasoning. Yet most semantic parsers produce domain-specific or inadequately expressive representations. We propose a series of generative transition-based models for dependency syntax which can be applied as both parsers and language models while being amenable to supervised or unsupervised learning. Two models are based on Markov assumptions commonly made in NLP: The first is a Bayesian model with hierarchical smoothing, the second is parameterised by feed-forward neural networks. The Bayesian model enables careful analysis of the structure of the conditioning contexts required for generative parsers, but the neural network is more accurate. As a language model the syntactic neural model outperforms both the Bayesian model and n-gram neural networks, pointing to the complementary nature of distributed and structured representations for syntactic prediction. We propose approximate inference methods based on particle filtering. The third model is parameterised by recurrent neural networks (RNNs), dropping the Markov assumptions. Exact inference with dynamic programming is made tractable here by simplifying the structure of the conditioning contexts. We then shift the focus to semantics and propose models for parsing sentences to labelled semantic graphs. We introduce a transition-based parser which incrementally predicts graph nodes (predicates) and edges (arguments). This approach is contrasted against predicting top-down graph traversals. RNNs and pointer networks are key components in approaching graph parsing as an incremental prediction problem. The RNN architecture is augmented to condition the model explicitly on the transition system configuration. We develop a robust parser for Minimal Recursion Semantics, a linguistically-expressive framework for compositional semantics which has previously been parsed only with grammar-based approaches. Our parser is much faster than the grammar-based model, while the same approach improves the accuracy of neural Abstract Meaning Representation parsing.
Supervisor: Blunsom, Phil Sponsor: Clarendon Fund ; Skye Foundation (South Africa)
Qualification Name: Thesis (Ph.D.) Qualification Level: Doctoral
EThOS ID:  DOI: Not available
Keywords: Natural Language Processing ; Computer Science ; Machine Learning ; Computational Linguistics ; Parsing ; Deep Learning ; Generative Models ; Language Modelling ; Semantic Parsing ; Dependency Parsing ; Incremental Parsing