Use this URL to cite or link to this record in EThOS:
Title: 3D hand pose regression with variants of decision forests
Author: Tang, Danhang
ISNI:       0000 0004 5920 8594
Awarding Body: Imperial College London
Current Institution: Imperial College London
Date of Award: 2015
Availability of Full Text:
Access from EThOS:
Full text unavailable from EThOS. Please try the link below.
Access from Institution:
3D hand pose regression is a fundamental component in many modern human computer interaction applications such as sign language recognition, virtual object manipulation, game control, etc. This thesis focuses on the scope of 3D pose regression with a single hand from depth data. The problem has many challenges including high degrees of freedom, severe viewpoint changes, self-occlusion and sensor noise. The main contributions of this work are to propose a series of decision forest-based methods in a progressive manner, which improves upon the previous and achieves state-of-the-art performance is achieved in the end. The thesis first introduces a novel algorithm called semi-supervised transductive regression forest, which combines transductive learning and semi-supervised learning to bridge the gap between synthetically generated, noise-free training data and real noisy data. Moreover, it incorporates a coarse-to-fine training quality function to handle viewpoint changes in a more efficient manner. As a patch-based method, STR forest has high complexity during inference. To handle that, this thesis proposes latent regression forest, a method that models the pose estimation problem as a coarse-to-fine search. This inherently combines the efficiency of a holistic method and the flexibility of a patch-based method, and thus results in 62.5 FPS without CPU/GPU optimisation. Targeting the drawbacks of LRF, a new algorithm called hierarchical sampling forests is proposed to model this problem as a progressive search, guided by kinematic structure. Hence the intermediate results (partial poses) can be verified by a new efficient energy function. Consequently it can produce more accurate full poses. All these methods are thoroughly described, compared and published. In the conclusion part we discuss and analyse their differences, limitations and usage scenarios, and then propose a few ideas for future work.
Supervisor: Kim, Tae-Kyun Sponsor: Not available
Qualification Name: Thesis (Ph.D.) Qualification Level: Doctoral
EThOS ID:  DOI: Not available