STORM: an unsupervised connectionist model for language acquisition
Language acquisition is one of the core problems in artificial intelligence. Current
performance bottlenecks in natural language processing (NLP) systems result from the
need for vast amounts of language and domain-specific knowledge.
Consequently, the creation of an automated language acquisition system would revolutionize
the field of NLP. Connectionist models that learn by example (i.e. artificial neural networks)
have been successfully applied to many areas of language acquisition. However, the most
widely used class of these models, known as supervised connectionist models, has a number
of major limitations, including an inability to represent variables and a limited ability to
generalize from sparse data. Such limitations have prevented connectionist models from being
applied to large-scale language acquisition.
This research considers the alternative and less widely used class of unsupervised
connectionist models and investigates whether such models can capture the finite-state
properties of language. A novel unsupervised connectionist model, STORM (Spatio Temporal
Self-Organizing Recurrent Map), is proposed that uses a memory-rule based approach to learn
a regular grammar from a set of positive example sequences. STORM's learning algorithm
uses a derivation of functional-equivalence theory that allows the model to learn via similarity
of behaviour, rather than just similarity of form. This novel functional generalization ability
allows STORM to learn a perfect and stable representation of the Reber grammar from a
sparse training set of just 30 sequences, as opposed to the 60,000 sequences required to train a
supervised connectionist model. Unlike supervised models, once STORM has learnt the
grammar it can generalize to test sequences of any length or depth of embedding.
Extensions to the model are proposed to show how STORM can learn context-free grammars.
These extensions also solve the logical problem of language acquisition by recovering from
overgeneralizations without the need for negative evidence.
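
As an illustration of the kind of positive example sequences described above, the following is a minimal sketch of a generator and acceptor for the Reber grammar. It is not STORM itself and makes no claim about how the model's training set was built. The transition table is the commonly cited form of the Reber grammar and is an assumption here, since the abstract does not list the transitions; the names REBER, generate, accepts and sample_training_set are hypothetical.

```python
import random

# Transition table for the commonly cited Reber grammar (an assumption:
# the abstract does not list the grammar's transitions).
# state -> list of (symbol, next_state); state 5 is the accepting state.
REBER = {
    0: [("T", 1), ("P", 2)],
    1: [("S", 1), ("X", 3)],
    2: [("T", 2), ("V", 4)],
    3: [("X", 2), ("S", 5)],
    4: [("P", 3), ("V", 5)],
}
FINAL_STATE = 5


def generate(rng):
    """Random walk through the grammar, returning one valid sequence."""
    state, symbols = 0, ["B"]  # every sequence begins with B
    while state != FINAL_STATE:
        symbol, state = rng.choice(REBER[state])
        symbols.append(symbol)
    symbols.append("E")  # every sequence ends with E
    return "".join(symbols)


def accepts(sequence):
    """Return True if the sequence is generated by the grammar above."""
    if len(sequence) < 2 or sequence[0] != "B" or sequence[-1] != "E":
        return False
    state = 0
    for symbol in sequence[1:-1]:
        if state == FINAL_STATE:
            return False  # symbols after the accepting state are invalid
        next_state = dict(REBER[state]).get(symbol)
        if next_state is None:
            return False  # no transition for this symbol
        state = next_state
    return state == FINAL_STATE


def sample_training_set(n=30, seed=0):
    """Draw n positive examples (n=30 mirrors the sparse set size quoted above)."""
    rng = random.Random(seed)
    return [generate(rng) for _ in range(n)]
```

Calling sample_training_set() yields 30 grammatical strings, each of which accepts() recognizes. Because the grammar contains loops, the same acceptor can check test sequences of arbitrary length, which is the property a learner must capture to generalize beyond its training set.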