Personalised information filtering using event causality
Previous research on multimedia information filtering has mainly concentrated on key frame identification and video skim generation for browsing purposes. However, applications that generate summaries as the final product for user consumption are of equal scientific and commercial interest. Recent advances in computer vision have enabled the extraction of semantic events from an audio-visual signal, so we assume for our purposes that such semantic labels are already available. We concentrate instead on developing methods to prioritise these semantic elements for inclusion in a summary that can be personalised to meet a particular user's needs. Our work differs from that in the literature in that it is driven by the results of a knowledge elicitation study with expert summarisers. The experts in our study believe that summaries structured as a narrative are better able to convey the content of the original data to a user.
Motivated by the information filtering problem, the primary contribution of this thesis is the design and implementation of a system that summarises sequences of events by automatically modelling the causal relationships between them. We show, by comparison against summaries generated by experts and with the introduction of a new coherence metric, that modelling the causal relationships between events increases the coherence and accuracy of summaries. We suggest that this claim is valid not only in the domain of soccer highlights generation, in which we carry out the bulk of our experiments, but also in any other domain in which causal relationships can be identified between events. This proposal is tested by applying our summarisation system to another, significantly different domain, that of business meeting summarisation, using the soccer training set and a manually generated ontology mapping.
We introduce the concept of a context-group of causally related events as a first step towards modelling narrative episodes, and present a comparison between a case-based reasoning approach and a two-stage Markov model approach to summarisation. For both methods we show that including entire context-groups in the summary, rather than single events in isolation, produces more accurate summaries. Our approach to personalisation biases a summary towards particular narrative plotlines by using different subsets of the training data. Results show that the number of instances of certain event classes can be increased by biasing the training set appropriately. This method gives very similar results to a standard weighting method, while avoiding the need to tailor the weights to a particular application domain.
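The idea of selecting whole context-groups rather than isolated events can be illustrated with a minimal sketch. It is not the system described in this thesis: the `Event` type, the hand-written `CAUSES` table, the adjacency-based grouping rule and the `summarise` function are all hypothetical, chosen only to show the principle. In the thesis the causal relationships are modelled automatically from training data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    time: int   # position in the event sequence (hypothetical unit)
    label: str  # semantic event class, assumed to be already extracted


# Hypothetical causal links between event classes (cause -> possible effects).
# A hand-written stand-in for relationships that would be learned from data.
CAUSES = {
    "foul": {"free_kick", "yellow_card"},
    "free_kick": {"shot"},
    "shot": {"goal", "save"},
}


def context_groups(events):
    """Chain each event to the one before it when a causal link exists."""
    groups = []
    for ev in sorted(events, key=lambda e: e.time):
        if groups and ev.label in CAUSES.get(groups[-1][-1].label, set()):
            groups[-1].append(ev)  # ev is a plausible effect of the previous event
        else:
            groups.append([ev])    # no causal link: start a new group
    return groups


def summarise(events, important, budget):
    """Keep whole groups that contain an important label, up to `budget` events."""
    summary = []
    for group in context_groups(events):
        relevant = any(ev.label in important for ev in group)
        if relevant and len(summary) + len(group) <= budget:
            summary.extend(group)
    return summary


match = [
    Event(1, "pass"), Event(2, "foul"), Event(3, "free_kick"),
    Event(4, "shot"), Event(5, "goal"), Event(6, "throw_in"),
]
highlights = summarise(match, important={"goal"}, budget=10)
```

In this toy sequence the goal is returned together with the foul, free kick and shot that led to it, which is the narrative context that a single-event selection would drop.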