Title: 3D hand pose estimation : methods, datasets, and challenges
Author: Yuan, Shanxin
ISNI:       0000 0004 7963 7812
Awarding Body: Imperial College London
Current Institution: Imperial College London
Date of Award: 2019
3D hand pose estimation is an important task in computer vision because of its many applications, including, but not limited to, human-computer interaction, virtual and augmented reality, sign language recognition, and medical image analysis. The challenges of the task lie in the high degrees of freedom of the human hand, self-occlusion, variation in hand shape, and ambiguities among fingers. The main obstacles facing the research community are a lack of suitable methods and the limitations of current datasets. In view of this, this thesis investigates three aspects: methods, datasets, and challenges. More specifically, its contributions are: (1) a large-scale hand pose dataset, the BigHand2.2M dataset, collected using a novel capture method; (2) a depth-based 3D hand pose challenge that attracted top research groups from across the world, evaluating the current state-of-the-art methods, investigating best practices, and identifying promising research directions; (3) a method for 3D hand pose estimation from RGB images that uses depth data as privileged information.
Real datasets are limited in quantity and coverage, mainly because of the difficulty of annotating them. To address this issue, the thesis proposes a tracking system that uses six magnetic 6D sensors and inverse kinematics to automatically obtain 21-joint pose annotations for depth maps captured with minimal restriction on the range of motion. This automatic annotation method allowed us to build the largest real dataset to date, with higher joint-annotation accuracy than previous datasets. To assess the current state of 3D hand pose estimation from depth and identify the next challenges, we hosted the Hands In the Million Challenge (HIM2017) and investigated state-of-the-art methods on three tasks: single-frame 3D pose estimation, 3D hand tracking, and hand pose estimation during object interaction.
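To give a sense of how inverse kinematics turns sensor measurements into joint annotations, the sketch below fits the two joint angles of a planar two-link finger model so that its fingertip matches a measured target position, by gradient descent on the squared error. This is an illustrative toy (all functions and parameters here are made up for the example), not the six-sensor capture setup used in the thesis:

```python
import numpy as np

def fk(angles, lengths=(1.0, 1.0)):
    """Forward kinematics of a planar two-link chain: joint angles -> fingertip."""
    a1, a2 = angles
    l1, l2 = lengths
    x = l1 * np.cos(a1) + l2 * np.cos(a1 + a2)
    y = l1 * np.sin(a1) + l2 * np.sin(a1 + a2)
    return np.array([x, y])

def ik_fit(target, lr=0.05, steps=1000):
    """Fit joint angles so fk(angles) matches a measured target position."""
    angles = np.zeros(2)
    eps = 1e-5
    for _ in range(steps):
        # Numerical gradient of the squared fitting error w.r.t. each angle.
        grad = np.zeros_like(angles)
        for i in range(len(angles)):
            d = np.zeros_like(angles)
            d[i] = eps
            e_plus = np.sum((fk(angles + d) - target) ** 2)
            e_minus = np.sum((fk(angles - d) - target) ** 2)
            grad[i] = (e_plus - e_minus) / (2 * eps)
        angles = angles - lr * grad
    return angles

# A reachable target for a chain with link lengths 1.0 and 1.0.
target = np.array([1.2, 0.8])
angles = ik_fit(target)
```

The full annotation problem in the thesis is of the same shape but higher-dimensional: a 21-joint hand model is fitted so that the model agrees with the six magnetic 6D sensor measurements.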
The thesis analyses the performance of different CNN structures with regard to hand shape, joint visibility, viewpoint, and articulation distributions. Because hand pose estimation from RGB images has lagged behind estimation from depth images, the thesis proposes a method for hand pose estimation from RGB images that uses both external large-scale depth image datasets and paired depth and RGB images as privileged information at training time. We show that providing depth information during training significantly improves the performance of pose estimation from RGB images at test time.
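The privileged-information idea can be sketched in a few lines. In this illustrative toy (linear models and made-up variable names, not the thesis architecture), a student that sees only RGB features is trained with a task loss plus a "hint" loss pulling its predictions toward a representation computed from depth; the depth branch is then discarded at test time:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: x_rgb is the observed input; x_depth is privileged,
# available only at training time (all names here are illustrative).
n, d_rgb, d_depth, d_out = 256, 16, 8, 3
x_rgb = rng.normal(size=(n, d_rgb))
x_depth = rng.normal(size=(n, d_depth))
y = x_rgb @ rng.normal(size=(d_rgb, d_out)) \
    + 0.1 * x_depth @ rng.normal(size=(d_depth, d_out))

# "Teacher" representation computed from depth (frozen, training-time only).
teacher_feat = x_depth @ rng.normal(size=(d_depth, d_out))

# Student sees only RGB; it minimises a task loss plus a hint loss
# that pulls its output toward the depth-derived teacher representation.
w_student = np.zeros((d_rgb, d_out))
lam, lr = 0.5, 0.05
for _ in range(500):
    pred = x_rgb @ w_student
    grad_task = x_rgb.T @ (pred - y) / n
    grad_hint = x_rgb.T @ (pred - teacher_feat) / n
    w_student -= lr * (grad_task + lam * grad_hint)

# At test time only RGB is needed; depth never enters inference.
test_pred = x_rgb @ w_student
```

The design point is that depth contributes only through an extra training-time loss term, so the deployed model's inputs are unchanged.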
Supervisor: Kim, Tae-Kyun
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral