Use this URL to cite or link to this record in EThOS: http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.683855
Title: Real time hand pose estimation for human computer interaction
Author: Krejov, Philip G.
ISNI:       0000 0004 5918 8765
Awarding Body: University of Surrey
Current Institution: University of Surrey
Date of Award: 2016
Availability of Full Text:
Access from EThOS:
Access from Institution:
Abstract:
The aim of this thesis is to address the challenge of real-time pose estimation of the hand. Specifically this thesis aims to determine the joint positions of a non-augmented hand. This thesis focuses on the use of depth, performing localisation of the parts of the hand for efficient fitting of a kinematic model and consists of four main contributions. The first contribution presents an approach to Multi-touch(less) tracking, where the objective is to track the fingertips with a high degree of accuracy without sensor contact. Using a graph based approach, the surface of the hand is modelled and extrema of the hand are located. From this, gestures are identified and used for interaction. We briefly discuss one use case for this technology in the context of the Making Sense demonstrator inspired by the film ”The Minority Report”. This demonstration system allows an operator to quickly summarise and explore complex multi-modal multimedia data. The tracking approach allows for collaborative interactions due to its highly efficient tracking, resolving 4 hands simultaneously in real-time. The second contribution applies a Randomised Decision Forest (RDF) to the problem of pose estimation and presents a technique to identify regions of the hand, using features that sample depth. The RDF is an ensemble based classifier that is capable of generalising to unseen data and is capable of modelling expansive datasets, learning from over 70,000 pose examples. The approach is also demonstrated in the challenging application of American Sign Language (ASL) fingerspelling recognition. The third contribution combines a machine learning approach with a model based method to overcome the limitations of either technique in isolation. A RDF provides initial segmentation allowing surface constraints to be derived for a 3D model, which is subsequently fitted to the segmentation. This stage of global optimisation incorporates temporal information and enforces kinematic constraints. Using Rigid Body Dynamics for optimisation, invalid poses due to self-intersection and segmentation noise are resolved. Accuracy of the approach is limited by the natural variance between users and the use of a generic hand model. The final contribution therefore proposes an approach to refine pose via cascaded linear regression which samples the residual error between the depth and the model. This combination of techniques is demonstrated to provide state of the art accuracy in real time, without the use of a GPU and without the requirement for model initialisation.
Supervisor: Bowden, Richard ; Gilbert, Andrew Sponsor: EPSRC
Qualification Name: Thesis (Ph.D.) Qualification Level: Doctoral
EThOS ID: uk.bl.ethos.683855  DOI: Not available
Share: