Use this URL to cite or link to this record in EThOS:
Title: Real-time 3D morphable shape model fitting to monocular in-the-wild videos
Author: Huber, Patrik
ISNI:       0000 0004 6422 8768
Awarding Body: University of Surrey
Current Institution: University of Surrey
Date of Award: 2017
Availability of Full Text:
Access from EThOS:
Access from Institution:
Reconstructing 3D face shape from a single 2D photograph as well as from video is an inherently ill-posed problem with many ambiguities. One way to solve some of the ambiguities is using a 3D face model to aid the task. 3D Morphable Face Models (3DMMs) are amongst the state of the art methods for 3D face reconstruction, or so called 3D model fitting. However, current existing methods have severe limitations, and most of them have not been trialled on in-the-wild data. Current analysis-by-synthesis methods form complex non-linear optimisation processes, and optimisers often get stuck in local optima. Further, most existing methods are slow, requiring in the order of minutes to process one photograph. This thesis presents an algorithm to reconstruct 3D face shape from a single image as well as from sets of images or video frames in real-time. We introduce a solution for linear fitting of a PCA shape identity model and expression blendshapes to 2D facial landmarks. To improve the accuracy of the shape, a fast face contour fitting algorithm is introduced. These different components of the algorithm are run in iteration, resulting in a fast, linear shape-to-landmarks fitting algorithm. The algorithm, specifically designed to fit to landmarks obtained from in-the-wild images, by tackling imaging conditions that occur in in-the-wild images like facial expressions and the mismatch of 2D–3D contour correspondences, achieves the shape reconstruction accuracy of much more complex, nonlinear state of the art methods, while being multiple orders of magnitudes faster. Second, we address the problem of fitting to sets of multiple images of the same person, as well as monocular video sequences. We extend the proposed shape-tolandmarks fitting to multiple frames by using the knowledge that all images are from the same identity. To recover facial texture, the approach uses texture from the original images, instead of employing the often-used PCA albedo model of a 3DMM. We employ an algorithm that merges texture from multiple frames in real-time based on a weighting of each triangle of the reconstructed shape mesh. Last, we make the proposed real-time 3D morphable face model fitting algorithm available as open-source software. In contrast to ubiquitous available 2D-based face models and code, there is a general lack of software for 3D morphable face model fitting, hindering a widespread adoption. The library thus constitutes a significant contribution to the community.
Supervisor: Kittler, Josef Sponsor: Centre for Vision, Speech and Signal Processing (CVSSP) ; Cognitec Systems GmbH
Qualification Name: Thesis (Ph.D.) Qualification Level: Doctoral
EThOS ID:  DOI: Not available