Use this URL to cite or link to this record in EThOS: https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.748794
Title: Visual-inertial odometry, mapping and re-localization through learning
Author: Clark, Ronald
Awarding Body: University of Oxford
Current Institution: University of Oxford
Date of Award: 2017
Availability of Full Text:
Access from EThOS: Full text unavailable from EThOS; please try the institutional access link.
Access from Institution:
Abstract:
Precise pose information is a fundamental prerequisite for numerous applications in robotics, AI and mobile computing. Monocular cameras are an ideal sensor for this purpose: they are cheap, lightweight and ubiquitous. As such, monocular visual localization is widely regarded as a cornerstone of machine perception. However, a large gap still exists between the performance these applications require and what existing monocular perception algorithms can deliver. In this thesis we tackle the problem of robust egocentric visual localization and mapping through a data-centric approach. As a first major contribution, we propose novel learnt models for visual odometry, which form the basis of the ego-motion estimates used in later chapters. The proposed models are far less fragile than existing approaches, and we present experimental evidence that they not only approach the accuracy of standard methods but in many cases also offer major improvements in computational and memory efficiency. To cope with the drift inherent in odometry, we then introduce a novel learnt spatio-temporal model for global relocalization updates. It infers the global location of an image stream in a fraction of the time of traditional feature-based approaches, with minimal loss in localization accuracy. Finally, we present a novel SLAM system that integrates our learnt priors to create 3D maps from monocular image sequences. The system is designed to harness multiple input sources, including prior depth and ego-motion estimates, and incorporates both loop-closure and relocalization updates. Built on the well-established visual-inertial structure-from-motion process, it performs accurate posterior inference of camera poses and scene structure, significantly boosting reconstruction robustness and fidelity. Through qualitative and quantitative experiments on a wide range of datasets, we conclude that the proposed methods can bring accurate visual localization to a wide class of consumer devices and robotic platforms.
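The abstract describes the learnt odometry models only at a high level. As a rough illustration of this line of work, the following is a minimal sketch of a learnt visual-inertial odometry network: a CNN encodes consecutive image pairs, an LSTM encodes the high-rate inertial stream, and a recurrent core fuses both to regress 6-DoF relative poses. All module names, dimensions and design choices are illustrative assumptions, not the architecture used in the thesis.

# Minimal sketch of a learnt visual-inertial odometry model of the kind the
# abstract describes. Everything here (layer sizes, pose parameterization) is
# an illustrative assumption, not the thesis's actual network.
import torch
import torch.nn as nn

class VIONet(nn.Module):
    def __init__(self, imu_dim=6, hidden=256):
        super().__init__()
        # Visual encoder: each input is a stacked pair of RGB frames (6 channels).
        self.visual = nn.Sequential(
            nn.Conv2d(6, 32, 7, stride=2, padding=3), nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Inertial encoder: an LSTM over the IMU samples between the two frames.
        self.inertial = nn.LSTM(imu_dim, 64, batch_first=True)
        # Fusion core: recurrent over time, so motion history informs each estimate.
        self.core = nn.LSTM(128 + 64, hidden, batch_first=True)
        # Regress a 6-DoF relative pose (translation + axis-angle rotation) per step.
        self.pose = nn.Linear(hidden, 6)

    def forward(self, frames, imu):
        # frames: (B, T, 6, H, W)    -- T stacked image pairs
        # imu:    (B, T, S, imu_dim) -- S IMU samples per frame interval
        B, T = frames.shape[:2]
        v = self.visual(frames.flatten(0, 1)).view(B, T, -1)
        _, (h, _) = self.inertial(imu.flatten(0, 1))
        i = h[-1].view(B, T, -1)
        fused, _ = self.core(torch.cat([v, i], dim=-1))
        return self.pose(fused)  # (B, T, 6) relative poses

# Usage: regress relative poses for a batch of two short five-frame clips.
model = VIONet()
poses = model(torch.randn(2, 5, 6, 64, 64), torch.randn(2, 5, 10, 6))
print(poses.shape)  # torch.Size([2, 5, 6])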
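The posterior inference the abstract alludes to, fusing drifting odometry with occasional global relocalization and loop-closure updates, can be pictured as least-squares optimisation over a pose graph. Below is a toy, translation-only sketch with hypothetical measurements showing how a few global fixes remove odometric drift; the thesis's actual system optimises full camera poses and scene structure.

# Toy pose-graph fusion: drifty relative odometry plus two global "fixes".
# A 2-D translation-only example for illustration only.
import torch

T = 50
gt = torch.cumsum(torch.ones(T, 2) * 0.1, dim=0)             # ground-truth path
odom = torch.diff(gt, dim=0) + 0.02 * torch.randn(T - 1, 2)  # noisy odometry
fixes = {0: gt[0], T - 1: gt[-1]}                            # relocalization updates

poses = torch.zeros(T, 2, requires_grad=True)
opt = torch.optim.Adam([poses], lr=0.05)
for _ in range(500):
    opt.zero_grad()
    # Odometry factors: consecutive poses should match the relative measurements.
    err_odom = (torch.diff(poses, dim=0) - odom).pow(2).sum()
    # Relocalization factors: anchor the graph globally, removing accumulated drift.
    err_fix = sum((poses[k] - v).pow(2).sum() for k, v in fixes.items())
    (err_odom + 100.0 * err_fix).backward()
    opt.step()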
Supervisor: Markham, Andrew ; Trigoni, Agathoniki
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID: uk.bl.ethos.748794
DOI: Not available