Use this URL to cite or link to this record in EThOS:
Title: Supervised dictionary learning for action recognition and localization
Author: Kumar, B. G. Vijay
Awarding Body: Queen Mary, University of London
Current Institution: Queen Mary, University of London
Date of Award: 2012
Availability of Full Text:
Access from EThOS:
Access from Institution:
Image sequences with humans and human activities are everywhere. With the amount of produced and distributed data increasing at an unprecedented rate, there has been a lot of interest in building systems that can understand and interpret the visual data, and in particular detect and recognise human actions. Dictionary based approaches learn a dictionary from descriptors extracted from the videos in the first stage and a classifier or a detector in the second stage. The major drawback of such an approach is that the dictionary is learned in an unsupervised manner without considering the task (classification or detection) that follows it. In this work we develop task dependent(supervised) dictionaries for action recognition and localization, i.e., dictionaries that are best suited for the subsequent task. In the first part of the work, we propose a supervised max-margin framework for linear and non-linear Non-Negative Matrix Factorization (NMF). To achieve this, we impose max-margin constraints within the formulation of NMF and simultaneously solve for the classifier and the dictionary. The dictionary (basis matrix) thus obtained maximizes the margin of the classifier in the low dimensional space (in the linear case) or in the high dimensional feature space (in the non-linear case). In the second part the work, we develop methodologies for action localization. We first propose a dictionary weighting approach where we learn local and global weights for the dictionary by considering the localization information of the training sequences. We next extend this approach to learn a task-dependent dictionary for action localization that incorporates the localization information of the training sequences into dictionary learning. The results on publicly available datasets show that the performance of the system is improved by using the supervised information while learning dictionary.
Supervisor: Not available Sponsor: Not available
Qualification Name: Thesis (Ph.D.) Qualification Level: Doctoral
EThOS ID:  DOI: Not available
Keywords: Electronic Engineering ; Video surveillance