Use this URL to cite or link to this record in EThOS: https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.759793
Title: Hierarchical modelling and recognition of activities of daily living
Author: Tayyub, Jawad
ISNI: 0000 0004 7431 8126
Awarding Body: University of Leeds
Current Institution: University of Leeds
Date of Award: 2018
Abstract:
Activity recognition is becoming an increasingly important task in artificial intelligence. Successful activity recognition systems must be able to model and recognise activities ranging from simple short activities spanning a few seconds to complex longer activities spanning minutes or hours. We define activities as sets of qualitatively interesting interactions between people, objects and the environment. Accurate activity recognition is desirable in many scenarios, such as surveillance, smart environments and robotic vision. In the domain of robotic vision specifically, there is now increasing interest in autonomous robots that are able to operate without human intervention for long periods of time. The goal of this research is to build activity recognition approaches for such systems that are able to model and recognise simple short activities as well as complex longer activities arising from long-term autonomous operation of intelligent systems.
The research makes the following key contributions:
1. We present a qualitative and quantitative representation to model simple activities as observed by autonomous systems.
2. We present a hierarchical framework to efficiently model complex activities that comprise many sub-activities at varying levels of granularity.
Simple activities are modelled using a discriminative model in which a combined feature space, consisting of qualitative and quantitative spatio-temporal features, is generated in order to encode various aspects of the activity. Qualitative features are computed using qualitative spatio-temporal relations between human subjects and objects in order to represent the simple activity abstractly. Unlike current state-of-the-art approaches, our approach makes significantly fewer assumptions and does not require any knowledge about object types, their affordances, or the constituent activities of an activity.
The optimal and most discriminating features are then extracted, using an entropy-based feature selection process, to best represent the training data.
A novel approach for building models of complex long-term activities is also presented. The proposed approach builds a hierarchical activity model from mark-up of activities acquired from multiple annotators in a video corpus. Multiple human annotators identify activities at different levels of conceptual granularity. Our method automatically infers a ‘part-of’ hierarchical activity model from this data using semantic similarity of textual annotations and temporal consistency. We then consolidate hierarchical structures learned from different training videos into a generalised hierarchical model represented as an extended grammar describing the overall activity.
We then describe an inference mechanism to interpret new instances of activities. Simple short activity classes are first recognised using our previously learned generalised model. Given a test video, simple activities are detected as a stream of temporally complex low-level actions. We then use the learned extended grammar to infer the higher-level activities as a hierarchy over the low-level action input stream.
We make use of three publicly available datasets to validate our two approaches to modelling simple and complex activities. These datasets have been annotated by multiple annotators through crowd-sourcing and in-house annotations. They consist of daily activity videos such as ‘cleaning microwave’, ‘having lunch in a restaurant’ and ‘working in an office’. The activities in these datasets have all been marked up at multiple levels of abstraction by multiple annotators; however, no information on the ‘part-of’ relationship between activities is provided. The complexity of the videos and their annotations allows us to demonstrate the effectiveness of the proposed methods.
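The ‘part-of’ inference described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: it uses temporal containment only and ignores the semantic-similarity component, it assumes each label occurs once per video, and the names `Annotation`, `contains` and `infer_part_of`, the sub-activity labels and all timings are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Annotation:
    label: str    # free-text activity label from an annotator
    start: float  # start time in seconds
    end: float    # end time in seconds


def contains(outer: Annotation, inner: Annotation, tol: float = 0.0) -> bool:
    """True if `outer` covers `inner` in time (within `tol` seconds) and is strictly longer."""
    covers = outer.start - tol <= inner.start and inner.end <= outer.end + tol
    longer = (outer.end - outer.start) > (inner.end - inner.start)
    return covers and longer


def infer_part_of(annotations: List[Annotation], tol: float = 0.0) -> Dict[str, Optional[str]]:
    """Map each label to the label of its shortest temporal container (None for roots)."""
    parents: Dict[str, Optional[str]] = {}
    for child in annotations:
        candidates = [a for a in annotations if contains(a, child, tol)]
        # Assumption: the tightest enclosing interval is the direct parent.
        parent = min(candidates, key=lambda a: a.end - a.start, default=None)
        parents[child.label] = parent.label if parent is not None else None
    return parents


# Hypothetical annotations for one video; only 'cleaning microwave' is a
# label mentioned in the abstract, the rest are made up for illustration.
example = [
    Annotation("cleaning microwave", 0, 60),
    Annotation("open door", 0, 5),
    Annotation("wipe interior", 5, 50),
    Annotation("scrub stain", 20, 30),
]
hierarchy = infer_part_of(example)
```

Here `hierarchy` places ‘scrub stain’ under ‘wipe interior’, and both ‘open door’ and ‘wipe interior’ under ‘cleaning microwave’. A fuller treatment would also have to reconcile differently worded labels from different annotators, which is where the semantic similarity mentioned in the abstract would come in.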
Supervisor: Cohn, Anthony; Hogg, David
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID: uk.bl.ethos.759793
DOI: Not available