Use this URL to cite or link to this record in EThOS: https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.788972
Title: Outlier removal in real-time object recognition and pose estimation
Author: Shao, Mang
ISNI: 0000 0004 8499 4777
Awarding Body: Imperial College London
Current Institution: Imperial College London
Date of Award: 2019
Abstract:
Outlier removal algorithms aim to detect and remove abnormal or negative data that differ sufficiently from the training samples. Since most object recognition and pose estimation methods involve a hypothesise-and-test scheme, especially for large-scale or real-time problems, outlier removal can be essential for achieving the desired performance. Unlike domain adaptation or transfer learning, outlier removal algorithms usually have no prior knowledge of negative samples during training. There is no universal solution: the choice of outlier removal algorithm usually depends on the task and on the machine learning technique applied. In this thesis, we investigate the application of outlier removal algorithms to object recognition and pose estimation problems. Specifically, we classify these algorithms into three types and investigate one application of each: a comparative study of object recognition in video for the distance-based approach; a new grouped outlier removal method for robust ellipse fitting for the registration-based approach; and a novel real-time, background-aware 3D pose estimation method for texture-less objects for the learning-based approach. The comparative study centres on using a wide range of spatial and temporal consistency cues to remove outlier feature points. State-of-the-art techniques are classified, implemented under a unified framework, and empirically evaluated on a newly collected museum dataset. For geometric cues, we find that 3D object structure learnt from a training video dataset improves the average video classification performance dramatically. By contrast, for temporal cues, tracking visual fixation across video sequences has little impact on accuracy, but removes a significant number of background feature points and reduces memory consumption. Furthermore, we propose a method that integrates these two cues to exploit the advantages of both.
Then, we present a registration-based outlier removal method capable of fitting ellipses in real time under high outlier rates. It builds on the observation that outliers generated by an ellipse edge point detector tend to appear in groups owing to real-world nuisances such as partial occlusion or illumination change. To handle grouped outliers while maintaining fitting efficiency, we introduce a proximity-based 'split and merge' approach to cluster the edge points, followed by a breadth-first outlier removal process. Experiments show that our algorithm achieves high performance across a wide range of outlier ratios and noise levels with various types of realistic nuisances. An outlier-aware extension of the randomised decision forest is proposed and applied to the real-time 3D object pose estimation problem, building on typical template matching methods. A set of templates uniformly covering the pose space is generated during training, and the nearest neighbour of the query point is found during testing. Since the amount of data arising from the background is orders of magnitude greater than that from the foreground during testing, it is desirable to reject the background early to save as much computation as possible. Hence the conventional randomised decision tree is modified into a ternary tree, in which each node, in addition to its original children, has an extra 'background rejection' node. During testing, query data far from the training samples are detected and rejected as they propagate down the trees. Furthermore, we propose applying a 'fuzzy' rather than binary decision when training the decision forest to raise the tolerance to ambiguous data samples, so that a sample near the decision boundary is assigned to both the left and right child nodes. Our approach is also scalable to large datasets, since the tree structure naturally provides logarithmic time complexity in the number of objects.
Finally, we further shorten the validation stage with a fast breadth-first scheme. The results show that our approach outperforms state-of-the-art methods in efficiency while maintaining comparable accuracy.
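Illustrative sketch (not taken from the thesis): the abstract's ternary tree with a 'background rejection' branch and its 'fuzzy' training split can be pictured with the toy Python code below. Everything specific here is an assumption made for illustration: the names Node, build_tree and classify, the centroid-plus-radius rejection test, the midpoint split on the widest feature (standing in for a randomised split test), and the parameters margin, slack, min_radius and min_size. The thesis may define all of these differently.

```python
from dataclasses import dataclass
from math import dist
from typing import List, Optional, Sequence


@dataclass
class Node:
    samples: List[int]             # indices of training templates that reached this node
    centre: List[float]            # mean of those templates
    radius: float                  # queries farther than this from centre are rejected
    feature: int = -1              # split dimension; -1 marks a leaf
    threshold: float = 0.0
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build_tree(data: Sequence[Sequence[float]], idx: List[int], margin: float = 0.1,
               slack: float = 1.5, min_radius: float = 0.5, min_size: int = 2) -> Node:
    """Build one tree; all parameters are hypothetical tuning knobs."""
    pts = [data[i] for i in idx]
    dims = range(len(pts[0]))
    centre = [sum(p[d] for p in pts) / len(pts) for d in dims]
    # Assumed rejection rule: a ball around the samples seen at this node.
    radius = max(min_radius, slack * max(dist(p, centre) for p in pts))
    node = Node(list(idx), centre, radius)
    if len(idx) <= min_size:
        return node
    lows = [min(p[d] for p in pts) for d in dims]
    highs = [max(p[d] for p in pts) for d in dims]
    spans = [hi - lo for lo, hi in zip(lows, highs)]
    f = spans.index(max(spans))    # widest feature, a stand-in for a random test
    if spans[f] == 0:
        return node                # all samples identical: keep as a leaf
    t = (lows[f] + highs[f]) / 2
    band = margin * spans[f]
    # Fuzzy split: samples within `band` of the threshold go to BOTH children.
    left = [i for i in idx if data[i][f] < t + band]
    right = [i for i in idx if data[i][f] >= t - band]
    if len(left) == len(idx) or len(right) == len(idx):
        return node                # split did not separate anything: stop here
    node.feature, node.threshold = f, t
    node.left = build_tree(data, left, margin, slack, min_radius, min_size)
    node.right = build_tree(data, right, margin, slack, min_radius, min_size)
    return node


def classify(root: Node, x: Sequence[float]) -> Optional[List[int]]:
    """Return candidate template indices, or None if rejected as background."""
    node = root
    while True:
        if dist(x, node.centre) > node.radius:
            return None            # third branch: early background rejection
        if node.feature < 0:
            return node.samples    # leaf: candidates for later validation
        node = node.left if x[node.feature] < node.threshold else node.right
```

In this sketch a query is tested against the rejection ball at every node, so most background queries leave the tree after one or two comparisons, while foreground queries reach a leaf whose template indices would then be passed to a validation stage. The fuzzy band is applied only while building the tree; at query time the split is a plain binary test.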
Supervisor: Kim, Tae-Kyun
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID: uk.bl.ethos.788972