Natural Language Generation (NLG) of discourse relations for different reading levels
This thesis describes original research in the field of Natural Language Generation (NLG). NLG is the subfield of artificial intelligence that is concerned with the automatic production of documents from underlying data. This thesis claims that an NLG system can generate more readable output texts by making appropriate choices at the discourse level, for instance by using shorter sentences and more common discourse cue phrases. The choices we investigated were the selection and placement of cue phrases, ordering, and punctuation. We investigated the effects of these choices on both good readers and poor readers. The NLG system built for this research was called GIRL (Generator for Individual Reading Levels). GIRL is part of a literacy assessment application. It generates feedback reports about reading skills for adults with poor literacy. This research focussed on the microplanner. Microplanning transforms discourse representations from hierarchical tree structures into ordered lists of individual sentence structures. The key innovations in microplanning were new ways to represent discourse-level knowledge and new algorithms for making discourse-level decisions. Knowledge about how humans realise discourse relations was acquired from a corpus annotated with discourse relations. This knowledge was represented in GIRL's microplanner as constraint satisfaction problem (CSP) graphs. A CSP solver was incorporated into the microplanner to generate all "legal" ways of realising each input discourse relation. Knowledge about which discourse-level choices affect readability was obtained from pilot experiments and from psycholinguistics. It was represented as sets of rules for scoring the solutions output by the CSP solver. GIRL's output was evaluated with thirty-eight users, including both good readers and poor readers. We developed a methodology for an evaluation experiment that involved measuring reading speed and comprehension and eliciting judgements.
The results, although not statistically significant, indicated that the algorithms produced more readable output and that the effect was greater on poor readers.
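
The two-stage microplanning idea described above (enumerate every "legal" realisation of a discourse relation, then score the candidates with readability rules) can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from GIRL: the variable names, the domains, the constraints in `is_legal`, the `COMMONNESS` weights and the scoring rule are all hypothetical, and brute-force enumeration stands in for a real CSP solver.

```python
from itertools import product

# Hypothetical decision variables and their domains for realising one
# contrast relation between two clauses (illustrative, not GIRL's own).
DOMAINS = {
    "cue": ["but", "however", "although"],
    "position": ["initial", "between"],
    "punctuation": ["comma", "full_stop"],
}


def is_legal(a):
    """Hypothetical constraints on how each cue phrase may be realised."""
    cue, position, punctuation = a["cue"], a["position"], a["punctuation"]
    if cue == "but":
        # Assumed: "but" sits between the clauses, after either mark.
        return position == "between"
    if cue == "however":
        # Assumed: "however" opens a new sentence between the clauses.
        return position == "between" and punctuation == "full_stop"
    if cue == "although":
        # Assumed: "although" keeps both clauses in one sentence.
        return punctuation == "comma"
    return False


def legal_realisations():
    """Enumerate all assignments and keep those satisfying the constraints."""
    names = list(DOMAINS)
    for values in product(*(DOMAINS[name] for name in names)):
        assignment = dict(zip(names, values))
        if is_legal(assignment):
            yield assignment


# Hypothetical readability weights: higher means a more common cue phrase.
COMMONNESS = {"but": 2, "although": 1, "however": 0}


def score(a):
    """Illustrative rule set: reward common cues and shorter sentences."""
    points = COMMONNESS[a["cue"]]
    if a["punctuation"] == "full_stop":
        points += 1  # splitting into two sentences shortens each one
    return points


def best_realisation():
    """Return the highest-scoring legal realisation."""
    return max(legal_realisations(), key=score)
```

Under these assumed constraints and weights, five of the twelve possible assignments are legal, and `best_realisation()` selects the common cue "but" placed between two separate sentences. Different scoring rules could be substituted for different reading levels without changing the enumeration step.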