Use this URL to cite or link to this record in EThOS: http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.616288
Title: Efficient object detection via structured learning and local classifiers
Author: Zhang, Ziming
Awarding Body: Oxford Brookes University
Current Institution: Oxford Brookes University
Date of Award: 2013
Abstract:
Object detection has made great strides recently. However, it still faces two big challenges: detection accuracy and computational efficiency. In this thesis, we present an automatic, efficient object detection framework that detects object instances in images using bounding boxes and can be trained and tested easily on current personal computers. Our framework is a sliding-window based approach and consists of two major components: (1) efficient object proposal generation, which predicts possible object bounding boxes, and (2) efficient object proposal verification, which classifies each bounding box in a multiclass manner. For object proposal generation, we formulate the problem as a structured learning problem and investigate structural support vector machines (SSVMs) with our proposed scale/aspect-ratio quantization scheme and ranking constraints. A general ranking-order decomposition algorithm is developed for solving the formulation efficiently, and is applied to generate proposals using a two-stage cascade. Using image gradients as features, our object proposal generation method achieves state-of-the-art results in terms of object recall at a low computational cost. For object proposal verification, we propose two locally linear classifiers and one locally nonlinear classifier to approximate the nonlinear decision boundaries in the feature space efficiently. Inspired by the kernel trick, these classifiers explicitly map the original features into another feature space where linear classifiers are employed for classification, and thus have linear computational complexity in both training and testing, similar to that of linear classifiers. Therefore, in general, our classifiers can achieve accuracy comparable to that of kernel-based classifiers at a lower computational cost.
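The idea of an explicit feature map followed by a linear classifier can be illustrated with a minimal sketch. It is not the thesis's own classifiers: the anchor points, the softmax soft-assignment weights, the `beta` parameter, and the least-squares fit are all assumptions chosen for illustration. Each sample x is mapped to the concatenation of gamma_k(x) * [x, 1] over a set of anchors, so a single linear model in the mapped space behaves like a different linear model near each anchor.

```python
import numpy as np

def local_feature_map(X, anchors, beta=10.0):
    """Map each row x to the concatenation of gamma_k(x) * [x, 1] over all anchors.

    Hypothetical illustration: gamma is a softmax over negative squared
    distances to the anchors, which is an assumed choice of local weighting.
    """
    # Squared distance from every sample to every anchor: shape (n, k)
    d2 = ((X[:, None, :] - anchors[None, :, :]) ** 2).sum(axis=2)
    # Soft assignment weights; each row sums to 1
    logits = -beta * d2
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    gamma = np.exp(logits)
    gamma /= gamma.sum(axis=1, keepdims=True)
    # Append a bias term, then weight one copy of [x, 1] per anchor
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])   # (n, d + 1)
    phi = gamma[:, :, None] * Xb[:, None, :]        # (n, k, d + 1)
    return phi.reshape(X.shape[0], -1)              # (n, k * (d + 1))

def fit_linear(F, y):
    """Least-squares linear model, used here in place of an SVM for brevity."""
    w, *_ = np.linalg.lstsq(F, y, rcond=None)
    return w

# XOR-style toy data: not separable by any single linear classifier in x
X = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0]])
y = np.array([-1.0, -1.0, 1.0, 1.0])
anchors = X.copy()  # assumed anchors; in practice they might come from clustering

# Linear model on the raw features (with bias)
raw = np.hstack([X, np.ones((X.shape[0], 1))])
raw_scores = raw @ fit_linear(raw, y)

# The same linear model on the explicitly mapped features
Phi = local_feature_map(X, anchors)
mapped_scores = Phi @ fit_linear(Phi, y)

print("raw scores:   ", np.round(raw_scores, 3))
print("mapped scores:", np.round(mapped_scores, 3))
```

On this toy problem the raw linear fit carries no signal, while the mapped fit recovers the labels. Evaluating the map costs time proportional to the number of anchors per sample, so with a fixed number of anchors the overall cost stays linear in the number of samples, in the spirit of the complexity claim above.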
To demonstrate its efficiency and generality, our framework is applied to four different object detection tasks: VOC detection challenges, traffic sign detection, pedestrian detection, and face detection. In each task, it performs reasonably well, with acceptable detection accuracy and good computational efficiency. For instance, on VOC datasets with 20 object classes, our method achieved a mean average precision (AP) of about 0.1 within 2 hours of training and 0.05 seconds of testing per 500 x 300 pixel image, using a mixture of MATLAB and C++ code on a current personal computer.
Supervisor: Not available
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID: uk.bl.ethos.616288
DOI: Not available