Title: Event modelling and recognition in video
Author: Gkalelis, Nikolaos
Awarding Body: Imperial College London
Current Institution: Imperial College London
Date of Award: 2013
The management of digital video has become a very challenging problem as the amount of video content continues to grow at a phenomenal rate. This trend necessitates the development of advanced techniques for the efficient and effective manipulation of video information. However, current video processing tools do not yet perform at a satisfactory level, mainly because of the discrepancy between computer-generated semantic descriptions of video content and human interpretations of the same content, commonly referred to as the semantic gap. Inspired by recent studies in neuroscience suggesting that humans remember real life through past experience structured in events, this thesis investigates appropriate models and machine learning approaches for representing and recognizing events in video. Specifically, a joint content-event model is proposed for describing video content (e.g., shots, scenes, etc.) as well as real-life events (e.g., demonstration, birthday party, etc.) and their key semantic entities (participants, location, etc.). At the core of this model stands a referencing mechanism that utilizes a set of video analysis algorithms for the automatic generation of event model instances and their enrichment with semantic information extracted from the video content. In particular, a set of subclass discriminant analysis and support vector machine methods is proposed for handling data nonlinearities and addressing several limitations of current state-of-the-art approaches. These approaches are evaluated using several publicly available benchmarks particularly suited to testing the robustness and reliability of nonlinear classification methods, such as the facial image collection of the Four Face database, datasets from the UCI repository, and others.
Moreover, the most efficient of the proposed methods are additionally evaluated on a large-scale video collection, consisting of the datasets provided in the TRECVID multimedia event detection (MED) track of 2010 and 2011, which are among the most challenging in this field, for the tasks of event detection and event recounting. This experiment is designed so that it also serves as a fundamental evaluation of the proposed joint content-event model.
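The abstract does not specify the proposed subclass discriminant analysis (SDA) variants. As a rough, generic illustration of the SDA idea only — not the thesis's algorithm — the following sketch partitions each class into subclasses, builds a between-subclass scatter matrix over pairs of subclasses from different classes, and projects onto the leading generalized eigenvectors. The naive subclass split, the regularization, and all function names are assumptions for illustration.

```python
import numpy as np

def subclass_means(X, y, n_sub=2):
    """Partition each class into n_sub subclasses and return their means,
    priors, and class labels. (Naive split along the first feature; real
    SDA methods typically use a clustering criterion.)"""
    means, priors, labels = [], [], []
    n = len(y)
    for c in np.unique(y):
        Xc = X[y == c]
        order = np.argsort(Xc[:, 0])            # deterministic naive split
        for part in np.array_split(order, n_sub):
            means.append(Xc[part].mean(axis=0))
            priors.append(len(part) / n)
            labels.append(c)
    return np.array(means), np.array(priors), np.array(labels)

def sda_projection(X, y, n_sub=2, n_dims=1):
    """Generic SDA-style projection: maximize scatter between subclasses
    of *different* classes relative to the (regularized) data scatter."""
    d = X.shape[1]
    M, P, L = subclass_means(X, y, n_sub)
    Sb = np.zeros((d, d))
    for i in range(len(M)):
        for j in range(i + 1, len(M)):
            if L[i] != L[j]:                    # only cross-class pairs
                diff = (M[i] - M[j])[:, None]
                Sb += P[i] * P[j] * diff @ diff.T
    Sw = np.cov(X.T) + 1e-6 * np.eye(d)         # regularized for stability
    evals, evecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
    order = np.argsort(-evals.real)
    return evecs.real[:, order[:n_dims]]

# Usage: project two Gaussian classes onto one discriminant direction
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (40, 2)), rng.normal(3, 1, (40, 2))])
y = np.array([0] * 40 + [1] * 40)
W = sda_projection(X, y, n_sub=2, n_dims=1)     # shape (2, 1)
```

By modelling each class as a mixture of subclasses, such methods can separate classes whose distributions are multimodal and hence not linearly separable around a single class mean — the kind of nonlinearity the abstract refers to.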
Supervisor: Stathaki, Tania
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available