Title: Human activity recognition using a wearable camera
Author: Tadesse, Girmaw Abebe
ISNI:       0000 0004 7653 7365
Awarding Body: Queen Mary University of London
Current Institution: Queen Mary, University of London
Date of Award: 2018
Advances in wearable technologies are facilitating the understanding of human activities using first-person vision (FPV) for a wide range of assistive applications. In this thesis, we propose multiple robust motion features for human activity recognition from first-person videos. The proposed features encode discriminant characteristics from the magnitude, direction and dynamics of motion estimated using optical flow. We also design novel virtual-inertial features from video, without using an actual inertial sensor, from the movement of the intensity centroid across frames. Results on multiple datasets demonstrate that centroid-based inertial features improve the recognition performance of grid-based features. Moreover, we propose a multi-layer modelling framework that encodes hierarchical and temporal relationships among activities. The first layer operates on groups of features that effectively encode motion dynamics and temporal variations of intra-frame appearance descriptors of activities with a hierarchical topology. The second layer exploits the temporal context by weighting the outputs of the hierarchy during modelling. In addition, a post-decoding smoothing technique utilises decisions on past samples based on the confidence of the current sample. We validate the proposed framework with several classifiers, and the temporal modelling is shown to improve recognition performance. We also investigate the use of deep networks to simplify the feature engineering from first-person videos. We propose stacking spectrograms to represent short-term global motion; the stack contains a frequency-time representation of multiple motion components. This enables us to apply 2D convolutions to learn motion features. We employ a long short-term memory (LSTM) recurrent network to encode long-term temporal dependencies among activities. Furthermore, we apply cross-domain knowledge transfer between inertial-based and vision-based approaches for egocentric activity recognition.
We propose a sparsity-weighted combination of information from different motion modalities and/or streams. Results show that the proposed approach performs competitively with existing deep frameworks while having reduced complexity.
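The virtual-inertial idea in the abstract, deriving inertial-like signals from the movement of the intensity centroid across frames, can be illustrated with a minimal sketch. This is not the thesis implementation: the function names `intensity_centroid` and `virtual_inertial_signals` are hypothetical, the centroid is assumed to be the intensity-weighted mean pixel position of a grayscale frame, and velocity and acceleration are assumed to be plain frame-to-frame differences in pixels per frame.

```python
import numpy as np


def intensity_centroid(frame):
    """Return the (x, y) intensity-weighted centroid of a 2D grayscale frame.

    Assumption: the centroid is the first-order image moment divided by the
    zeroth-order moment; the thesis may define it differently.
    """
    frame = np.asarray(frame, dtype=float)
    height, width = frame.shape
    total = frame.sum()
    if total == 0:
        # An all-black frame has no defined centroid; fall back to the centre.
        return np.array([(width - 1) / 2.0, (height - 1) / 2.0])
    ys, xs = np.mgrid[0:height, 0:width]
    return np.array([(xs * frame).sum() / total, (ys * frame).sum() / total])


def virtual_inertial_signals(frames):
    """Return the centroid trajectory and its first and second differences.

    Assumption: velocity and acceleration are simple finite differences of
    the centroid position, expressed per frame rather than per second.
    """
    centroids = np.array([intensity_centroid(f) for f in frames])
    velocity = np.diff(centroids, axis=0)      # shape (T - 1, 2)
    acceleration = np.diff(velocity, axis=0)   # shape (T - 2, 2)
    return centroids, velocity, acceleration
```

For a sequence of T frames, the sketch yields T centroid positions, T - 1 velocity samples and T - 2 acceleration samples, which could then be summarised over a window in the same way as signals from a physical inertial sensor.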
Supervisor: Not available
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available
Keywords: Electronic Engineering and Computer Science ; Interactive and Cognitive Environments ; wearable technologies ; first-person vision