Use this URL to cite or link to this record in EThOS: http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.557358
Title: Unsupervised learning of event and object classes from video
Author: Sridhar, Muralikrishna
Awarding Body: University of Leeds
Current Institution: University of Leeds
Date of Award: 2010
Abstract:
We present a method for unsupervised learning of event classes from videos in which multiple activities may occur simultaneously. Unsupervised discovery of event classes avoids the need to hand-craft event classes and thereby makes it possible, in principle, to scale up to the huge number of event classes that occur in the real world. Research into an unsupervised approach has important consequences for tasks such as video understanding and summarization, modelling usual and unusual behaviour, and video indexing for retrieval. These tasks are becoming increasingly important for scenarios such as surveillance, video search, robotic vision and sports highlights extraction as a consequence of the increasing proliferation of videos. The proposed approach is underpinned by a generative probabilistic model for events and a graphical representation for the qualitative spatial relationships between objects and their temporal evolution. Given a set of tracks for the objects within a scene, a set of event classes is derived from the most likely decomposition of the ‘activity graph’ of spatio-temporal relationships between all pairs of objects into a set of labelled events involving subsets of these objects. The posterior probability of candidate solutions favours decompositions in which events of the same class have a similar relational structure, together with three other measures of well-formedness. A Markov Chain Monte Carlo (MCMC) procedure is used to efficiently search for the maximum a posteriori (MAP) solution. This search moves between possible decompositions of the activity graph into sets of unlabelled events and at each move adds a close-to-optimal labelling (for this decomposition) using spectral clustering. Experiments on simulated and real data show that the discovered event classes are often semantically meaningful and correspond well with ground-truth event classes assigned by hand. Event learning is followed by learning of functional object categories.
Equivalence classes of objects are discovered on the basis of their similar functional role in multiple event instantiations. Objects are represented in a multidimensional space that captures their functional role in all the events. Unsupervised learning in this space results in functional object categories. Experiments in the domain of aircraft handling suggest that our spatio-temporal representation, together with the learning techniques, forms a promising framework for learning functional object categories from video.
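
The abstract describes a stochastic search over decompositions that keeps the most probable solution found. The following is a minimal Python sketch of that general idea only, not the thesis's model: it runs a Metropolis-style search over partitions of five items and remembers the best-scoring partition as a MAP estimate. The names `log_score`, `propose`, `map_search` and `SIM`, the toy similarity matrix and the per-block penalty are all assumptions made for illustration. The activity graph, the four posterior terms and the spectral-clustering labelling step are not modelled, and the proposal's asymmetry is ignored, which is tolerable for a search heuristic but would not give an exact sampler.

```python
import math
import random


def log_score(partition, similarity):
    """Toy log-posterior (hypothetical): sum of pairwise similarities
    inside each block, minus a small penalty per block."""
    total = 0.0
    for block in partition:
        for i in block:
            for j in block:
                if i < j:
                    total += similarity[i][j]
    return total - 0.5 * len(partition)


def propose(partition, rng):
    """Move one randomly chosen item to another block or to a new block."""
    blocks = [set(b) for b in partition]
    src = rng.randrange(len(blocks))
    item = rng.choice(sorted(blocks[src]))
    blocks[src].remove(item)
    dst = rng.randrange(len(blocks) + 1)  # last index means "open a new block"
    if dst == len(blocks):
        blocks.append({item})
    else:
        blocks[dst].add(item)
    # Drop any block left empty by the move.
    return [frozenset(b) for b in blocks if b]


def map_search(similarity, steps=2000, seed=0):
    """Metropolis-style search that returns the best partition visited."""
    rng = random.Random(seed)
    current = [frozenset({i}) for i in range(len(similarity))]
    current_score = log_score(current, similarity)
    best, best_score = current, current_score
    for _ in range(steps):
        candidate = propose(current, rng)
        candidate_score = log_score(candidate, similarity)
        delta = candidate_score - current_score
        # Always accept improvements; accept worse moves with prob. exp(delta).
        if rng.random() < math.exp(min(0.0, delta)):
            current, current_score = candidate, candidate_score
            if current_score > best_score:
                best, best_score = current, current_score
    return best, best_score


# Toy data (assumed): items 0-2 resemble each other, as do items 3-4.
SIM = [
    [0, 1, 1, -1, -1],
    [1, 0, 1, -1, -1],
    [1, 1, 0, -1, -1],
    [-1, -1, -1, 0, 1],
    [-1, -1, -1, 1, 0],
]

if __name__ == "__main__":
    best, score = map_search(SIM)
    print(sorted(sorted(block) for block in best), score)
```

Tracking the best state separately from the current state is what turns the sampler into a MAP search: the chain is free to accept worse moves and escape local optima without losing the best decomposition seen so far.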
Supervisor: Cohn, A.; Hogg, D.
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID: uk.bl.ethos.557358
DOI: Not available