Use this URL to cite or link to this record in EThOS: https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.746755
Title: Learning dense 3D models from monocular video
Author: Yu, Rui
ISNI: 0000 0004 7225 8553
Awarding Body: UCL (University College London)
Current Institution: University College London (University of London)
Date of Award: 2017
Availability of Full Text:
Access from EThOS: Full text unavailable from EThOS. Please try the link below.
Access from Institution:
Abstract:
Reconstructing the dense, detailed 3D shape of dynamic scenes from monocular sequences is a challenging problem in computer vision. While robust and even real-time solutions to this problem exist when the observed scene is static, current systems for non-rigid dense shape capture are typically restricted to using complex multi-camera rigs, exploiting the additional depth channel available in RGB-D cameras, or dealing with specific shapes such as faces or planar surfaces. In this thesis, we present two pieces of work on reconstructing dense generic shapes from monocular sequences. In the first work, we propose an unsupervised approach to the challenging problem of simultaneously segmenting the scene into its constituent objects and reconstructing a 3D model of the scene. The strength of our approach comes from its ability to deal with real-world dynamic scenes and to handle seamlessly different types of motion: rigid, articulated and non-rigid. We formulate the problem as a hierarchical graph-cuts based segmentation in which we decompose the whole scene into background and foreground objects and model the complex motion of non-rigid or articulated objects as a set of overlapping rigid parts. To validate the ability of our approach to deal with real-world scenes, we provide 3D reconstructions of challenging videos from datasets such as YouTube Objects and KITTI. In the second work, we propose a direct approach for capturing the dense, detailed 3D geometry of generic, complex non-rigid meshes using a single camera. Our method takes a single RGB video as input, captures the deformations of generic shapes, and produces depth estimates that are dense, per-pixel and direct. We first reconstruct a dense 3D template of the shape of the object from a short rigid sequence, and subsequently perform online reconstruction of the non-rigid mesh as it evolves over time. In our experimental evaluation, we show a range of qualitative results on novel datasets and quantitative comparisons with stereo reconstruction.
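As context for the graph-cuts segmentation mentioned in the abstract, the sketch below shows the basic binary foreground/background case solved as a min-cut over a pixel grid. It is an illustrative example only, assuming the PyMaxflow library; the cost inputs `unary_fg` and `unary_bg` are hypothetical placeholders for the motion-derived data terms a method like the thesis's would use, and the hierarchical, overlapping rigid-parts formulation described above is not reproduced here.

```python
# Minimal binary graph-cut segmentation sketch (illustrative, not the
# thesis's hierarchical formulation). Assumes PyMaxflow: pip install PyMaxflow
import numpy as np
import maxflow


def binary_graph_cut(unary_fg, unary_bg, smoothness=0.5):
    """Assign each pixel a foreground/background label via min-cut.

    unary_fg, unary_bg: HxW arrays of per-pixel costs for labelling a
    pixel foreground / background (hypothetical inputs; a motion-based
    method would derive these from optical flow or similar cues).
    """
    g = maxflow.Graph[float]()
    node_ids = g.add_grid_nodes(unary_fg.shape)
    # Pairwise smoothness terms on a 4-connected grid encourage
    # neighbouring pixels to take the same label.
    g.add_grid_edges(node_ids, smoothness)
    # Terminal edges carry the per-pixel data terms.
    g.add_grid_tedges(node_ids, unary_fg, unary_bg)
    g.maxflow()
    # Boolean grid: True marks pixels on the foreground side of the cut.
    return g.get_grid_segments(node_ids)


# Toy usage: segment a bright square from a dark background.
img = np.zeros((64, 64))
img[20:40, 20:40] = 1.0
mask = binary_graph_cut(unary_fg=1.0 - img, unary_bg=img)
```

In a real pipeline the unary costs would come from motion or appearance models rather than raw intensities, and a multi-label or hierarchical energy would replace this single binary cut.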
Supervisor: Agapito, L. Sponsor: Not available
Qualification Name: Thesis (Ph.D.) Qualification Level: Doctoral
EThOS ID: uk.bl.ethos.746755  DOI: Not available