Title: Advancing human pose and gesture recognition
Author: Pfister, Tomas
ISNI: 0000 0004 5354 3658
Awarding Body: University of Oxford
Current Institution: University of Oxford
Date of Award: 2015
Availability of Full Text:
Access from EThOS:
Full text unavailable from EThOS. Restricted access.
This thesis presents new methods in two closely related areas of computer vision: human pose estimation and gesture recognition in videos. In human pose estimation, we show that random forests can be used to estimate human pose in monocular videos. To this end, we propose a co-segmentation algorithm for segmenting humans out of videos, and an evaluator that predicts whether the estimated poses are correct. We further extend this pose estimator to new domains (with a transfer learning approach), and improve its predictions by estimating the joint positions sequentially (rather than independently) in an image and by using temporal information in the videos (rather than predicting the poses from a single frame). Finally, we go beyond random forests and show that convolutional neural networks can estimate human pose even more accurately and efficiently. We propose two new convolutional neural network architectures, and show how optical flow can be employed in convolutional nets to further improve the predictions. In gesture recognition, we explore the idea of using weak supervision to learn gestures. We show that sign language can be learnt automatically from signed TV broadcasts with subtitles by letting algorithms 'watch' the TV broadcasts and 'match' the signs with the subtitles. We further show that if even a small amount of strong supervision is available (as there is for sign language, in the form of sign language video dictionaries), it can be combined with weak supervision to learn even better models.
Supervisor: Zisserman, Andrew
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available
Keywords: Computing ; Image understanding ; Information engineering ; computer vision ; deep learning ; machine learning ; human pose estimation ; gesture recognition