Title: Scale, saliency and scene description
Author: Kadir, Timor
ISNI: 0000 0004 2704 0168
Awarding Body: Oxford University
Current Institution: University of Oxford
Date of Award: 2002
Availability of Full Text:
Full text unavailable from EThOS.
Please contact the current institution’s library for further details.
This thesis develops a novel information-theoretic methodology addressing three intrinsically related problems in vision: saliency, scale and description. The fundamental principle underpinning the proposed approach is the spatial (un)predictability of image attributes.

The thesis is concerned with the Scene Description task: the automatic extraction of a set of robust, relevant, and sufficiently complete semantic descriptions of a scene for subsequent inference. This task is essential for any application where an efficient, semantic representation of image-based data is necessary, for example in data-mining and communications systems. The main challenge is to extract the descriptions without assumptions about the exact nature of the subsequent inferences. Clearly, without prior knowledge the problem is intractable; this thesis therefore addresses the questions: “how much prior knowledge is needed, and at what stage need it be applied?” Many approaches to vision concentrate on specific scene entities (or ‘objects’), and hence do not capture a complete description. Those that do tend to be brittle and lack the necessary semantic level of description.

Motivated by the work of Gilles (1998) and recent successes of local appearance-based methods, a novel algorithm, called Scale Saliency, for quantifying image region saliency is presented. In this approach, regions are considered salient if they are simultaneously unpredictable in both feature space and scale-space. Unpredictability is measured as a function of the local PDF, generating a space of saliency values in R^3 (x, y and scale), from which features may be extracted by a suitable detection strategy. The technique is more generic than conventional saliency methods because saliency is defined independently of any particular basis morphology. The method can be made invariant to rotation, translation, non-uniform scaling, and uniform intensity variations, and robust to small changes in viewpoint.
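The core idea — entropy of the local PDF over scale, weighted by inter-scale change — can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the function name, the grey-level histogram as the "feature", the circular neighbourhood, the bin count, and the simple L1 inter-scale weighting are all assumptions made for the sketch.

```python
import numpy as np

def scale_saliency(image, x, y, scales, bins=16):
    """Illustrative sketch of a Scale Saliency measure at one pixel.

    For each scale s, form the grey-level histogram (local PDF) of the
    circular neighbourhood of radius s centred on (x, y) and compute its
    Shannon entropy H(s). Entropy peaks over scale are then weighted by
    the inter-scale change of the PDF (a stand-in for the thesis's
    weighting). Returns (best_scale, saliency).
    """
    h, w = image.shape
    yy, xx = np.mgrid[0:h, 0:w]
    dist2 = (xx - x) ** 2 + (yy - y) ** 2

    pdfs, entropies = [], []
    for s in scales:
        patch = image[dist2 <= s * s]          # circular neighbourhood
        hist, _ = np.histogram(patch, bins=bins, range=(0, 256))
        p = hist / hist.sum()                  # local PDF
        nz = p[p > 0]
        pdfs.append(p)
        entropies.append(-np.sum(nz * np.log2(nz)))  # Shannon entropy

    best, best_val = scales[0], 0.0
    for i in range(1, len(scales) - 1):
        # keep only entropy peaks over scale
        if entropies[i] >= entropies[i - 1] and entropies[i] >= entropies[i + 1]:
            # inter-scale unpredictability: L1 change of the local PDF
            w_s = scales[i] * np.abs(pdfs[i] - pdfs[i - 1]).sum()
            val = entropies[i] * w_s
            if val > best_val:
                best, best_val = scales[i], val
    return best, best_val
```

On a uniform region the entropy is zero at every scale, so the saliency is zero; at the boundary of a blob the entropy peaks at a scale comparable to the blob's size, which is how the approach selects scale and saliency jointly.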
The algorithm is applied to simple recognition tasks, and the features are shown to be robust and persistent (hence useful for tracking). The relevance of the scales and the generality of the saliency measure are demonstrated by using the PDF of salient scales to characterise textures; classification and unsupervised segmentation results are presented. A ‘by-product’ of this work is that the salient scales themselves make good texture descriptors.

For the texture segmentation experiments, a novel unsupervised Level Set based implementation of Region Competition is developed; its key property is that it operates on just one surface. Generalised Region Competition evolution equations are presented.

Finally, a unified approach to image modelling is proposed, based on spatial unpredictability at two scales: the local and the semi-local. Quantifying the unpredictability of image attributes at these two scales yields a space of image models that can represent several different image content types, such as blobs, lines, and statistical and structural textures, within a unified framework.
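The salient-scale texture descriptor described above can be sketched as follows. Everything here is illustrative rather than the thesis's method: the per-pixel "salient scale" is simplified to the scale of maximum local grey-level entropy, the sampling grid, function names, and the chi-square comparison of descriptors are assumptions for the sketch.

```python
import numpy as np

def salient_scale_map(image, scales, bins=16):
    """For each pixel on a coarse grid, pick the scale of maximum local
    grey-level entropy (a simplified stand-in for full Scale Saliency)."""
    h, w = image.shape
    yy, xx = np.mgrid[0:h, 0:w]
    out = []
    step, margin = 4, scales[-1]
    for y in range(margin, h - margin, step):
        for x in range(margin, w - margin, step):
            d2 = (xx - x) ** 2 + (yy - y) ** 2
            ents = []
            for s in scales:
                hist, _ = np.histogram(image[d2 <= s * s],
                                       bins=bins, range=(0, 256))
                p = hist / hist.sum()
                nz = p[p > 0]
                ents.append(-(nz * np.log2(nz)).sum())
            out.append(scales[int(np.argmax(ents))])  # most salient scale
    return np.array(out)

def scale_descriptor(scale_map, scales):
    """PDF of salient scales: the 'by-product' texture descriptor."""
    hist = np.array([(scale_map == s).sum() for s in scales], float)
    return hist / hist.sum()

def chi2(p, q, eps=1e-9):
    """Chi-square distance between two descriptor PDFs."""
    return 0.5 * np.sum((p - q) ** 2 / (p + q + eps))
```

Two textures can then be compared by the chi-square distance between their salient-scale PDFs; a fine-grained texture concentrates mass at small scales, a coarse one at large scales.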
Supervisor: Brady, J. M. Sponsor: Not available
Qualification Name: Thesis (Ph.D.) Qualification Level: Doctoral
EThOS ID:  DOI: Not available