Title: Automatic generation of factual questions from video documentaries
Author: Skalban, Yvonne
Awarding Body: University of Wolverhampton
Current Institution: University of Wolverhampton
Date of Award: 2013
Questioning sessions are an essential part of teachers’ daily instructional activities. Questions are used to assess students’ knowledge and comprehension and to promote learning. The manual creation of such learning material is a laborious and time-consuming task. Research in Natural Language Processing (NLP) has shown that Question Generation (QG) systems can be used to efficiently create high-quality learning materials to support teachers in their work and students in their learning process. A number of successful QG applications for education and training have been developed, but these focus mainly on supporting reading materials. However, digital technology is always evolving; there is an ever-growing amount of multimedia content available, and more and more delivery methods for audio-visual content are emerging and easily accessible. At the same time, research provides empirical evidence that multimedia use in the classroom has beneficial effects on student learning. Thus, there is a need to investigate whether QG systems can be used to assist teachers in creating assessment materials from these different types of media that are being employed in classrooms. This thesis serves to explore how NLP tools and techniques can be harnessed to generate questions from non-traditional learning materials, in particular videos. A QG framework which allows the generation of factual questions from video documentaries has been developed and a number of evaluations to analyse the quality of the produced questions have been performed. The developed framework uses several readily available NLP tools to generate questions from the subtitles accompanying a video documentary. The reason for choosing video documentaries is two-fold: firstly, they are frequently used by teachers and secondly, their factual nature lends itself well to question generation, as will be explained within the thesis.
The questions generated by the framework can be used as a quick way of testing students’ comprehension of what they have learned from the documentary. As part of this research project, the characteristics of documentary videos and their subtitles were analysed and the methodology has been adapted to be able to exploit these characteristics. An evaluation of the system output by domain experts showed promising results but also revealed that generating even shallow questions is a task which is far from trivial. To this end, the evaluation and subsequent error analysis contribute to the literature by highlighting the challenges QG from documentary videos can face. In a user study, it was investigated whether questions generated automatically by the system developed as part of this thesis and a state-of-the-art system can successfully be used to assist multimedia-based learning. Using a novel evaluation methodology, the feasibility of using a QG system’s output as ‘pre-questions’ was examined with different types of pre-questions (text-based and with images). The psychometric parameters of the questions generated automatically by the two systems and of those generated manually were compared. The results indicate that the presence of pre-questions (preferably with images) improves the performance of test-takers and they highlight that the psychometric parameters of the questions generated by the system are comparable to, if not better than, those of the state-of-the-art system. In another experiment, productivity was analysed by comparing the time taken to generate questions manually with the time taken to post-edit system-generated questions. A post-editing tool which allows for the tracking of several statistics such as edit distance measures, editing time, etc., was used. The quality of questions before and after post-editing was also analysed.
Not only did the experiments provide quantitative data about automatically and manually generated questions, but qualitative data in the form of user feedback, which provides an insight into how users perceived the quality of questions, was also gathered.
Supervisor: Mitkov, Ruslan; Specia, Lucia; Ha, Le An
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available
Keywords: Natural Language Processing ; Question Generation ; Factual Questions ; Video ; Subtitles ; Multimedia teaching ; Assessment