Title: Robust aggregation of local image descriptors for visual search
Author: Husain, Syed S.
ISNI:       0000 0004 5923 2930
Awarding Body: University of Surrey
Current Institution: University of Surrey
Date of Award: 2016
Visual search and recognition underpin numerous applications, including management of multimedia content, mobile commerce, surveillance, navigation, robotics and many others. However, the task remains challenging, predominantly due to the variability of object appearance and the ever-increasing size of the databases, often exceeding billions of images. The objective of this thesis is to develop a robust, compact and discriminative image representation suitable for visual search tasks. This thesis contributes to four research areas. First, we propose a novel method, named Robust Visual Descriptor (RVD), for deriving a compact and robust representation of image content which significantly advances the state of the art and delivers world-class performance. In our approach, the local descriptors are assigned to multiple cluster centres with rank weights, leading to a stable and reliable global image representation. Residual vectors are then computed in each cluster, normalized using a direction-preserving normalization and aggregated based on the neighbourhood rank information. We then propose two extensions to the core RVD descriptor. The first consists of de-correlating the weighted residual vectors by applying cluster-level PCA before aggregation. In the second extension, the weighted residual vectors are whitened in each cluster before aggregation, leading to a balanced energy distribution in each dimension and improved performance. Compressing floating-point global signatures to binary codes improves storage requirements and matching speed for large-scale image retrieval tasks. Our third contribution is to derive a compact and robust binary image signature from the core RVD representation. In addition, we propose a novel binary descriptor matching algorithm, PCAE with Weighted Hamming distance (PCAE+WH), to minimize the quantization loss associated with converting floating-point vectors to discrete binary codes.
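The aggregation steps described in the abstract (rank-weighted assignment to several cluster centres, per-cluster residuals, direction-preserving normalization, weighted accumulation) can be illustrated with a minimal NumPy sketch. This is not the thesis implementation: the function name `rvd_aggregate`, the example rank weights, the use of unit L2 scaling as the direction-preserving normalization, and the final global normalization are all assumptions made for illustration only.

```python
import numpy as np

def rvd_aggregate(descriptors, centres, rank_weights=(1.0, 0.5, 0.25)):
    """Illustrative rank-weighted residual aggregation (hypothetical sketch).

    descriptors: (n, d) array of local descriptors.
    centres:     (k, d) array of cluster centres (e.g. from k-means).
    rank_weights: assumed weights for the 1st, 2nd, ... nearest centre.
    """
    descriptors = np.asarray(descriptors, dtype=float)
    centres = np.asarray(centres, dtype=float)
    n_clusters, dim = centres.shape
    k = len(rank_weights)

    # Squared Euclidean distance from every descriptor to every centre.
    dists = ((descriptors[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    # Indices of the k nearest centres for each descriptor, nearest first.
    nearest = np.argsort(dists, axis=1)[:, :k]

    signature = np.zeros((n_clusters, dim))
    for x, ranked in zip(descriptors, nearest):
        for rank, c in enumerate(ranked):
            residual = x - centres[c]
            norm = np.linalg.norm(residual)
            if norm > 0:
                # Assumption: scale to unit length, which keeps the direction.
                residual = residual / norm
            # Accumulate the residual, weighted by its neighbourhood rank.
            signature[c] += rank_weights[rank] * residual

    # Assumption: flatten and L2-normalize the global signature.
    flat = signature.ravel()
    total = np.linalg.norm(flat)
    return flat / total if total > 0 else flat
```

With `k` cluster centres and `d`-dimensional descriptors, the sketch returns a vector of length `k * d`. The cluster-level PCA and whitening extensions mentioned above would act on the weighted residuals before they are accumulated and are not shown here.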
In the context of industry work on Compact Descriptors for Visual Search (CDVS) and its standardization in MPEG (ISO), we propose a scalable RVD representation. The bitrate scalability is achieved by employing novel Cluster Selection and Bit Selection mechanisms, which support interoperable binary RVD representations. Moreover, we propose a very efficient and effective score function based on weighted Hamming distance to compute the similarity between two binary representations. Our fourth contribution is to develop an image classification system based on the RVD representation. We introduce an effective method to incorporate second-order statistics in the original RVD framework.
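To make the idea of a score function based on weighted Hamming distance concrete, the following is a small sketch under stated assumptions. The abstract does not specify how the weights are derived or how the score is normalized, so the per-bit `weights` argument and the mapping of distance to a similarity in [0, 1] are hypothetical choices, as are the function names.

```python
import numpy as np

def weighted_hamming(a, b, weights):
    """Sum of the weights at the bit positions where a and b differ."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    weights = np.asarray(weights, dtype=float)
    return float(weights[a != b].sum())

def similarity(a, b, weights):
    """Assumed score: 1 for identical codes, 0 when every weighted bit differs."""
    total = float(np.sum(weights))
    if total == 0:
        return 0.0
    return 1.0 - weighted_hamming(a, b, weights) / total
```

For example, the codes `[1, 0, 1, 1]` and `[1, 1, 0, 1]` with weights `[4, 3, 2, 1]` differ in the second and third bits, giving a weighted distance of 5 and a similarity of 0.5 under this normalization.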
Supervisor: Bober, Miroslaw
Sponsor: Centre for Vision, Speech and Signal Processing
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available