Use this URL to cite or link to this record in EThOS:
Title: Non-Euclidean dissimilarity data in pattern recognition
Author: Xu, Weiping
Awarding Body: University of York
Current Institution: University of York
Date of Award: 2012
Availability of Full Text:
Access through EThOS:
Access through Institution:
This thesis addresses problems in dissimilarity (proximity) learning, particularly focusing on identifying the sources and rectifying the non-Euclidean dissimilarity in pattern recog- nition. We aim to develop a framework for analyzing the non-Euclidean dissimilarity by combining the methods from differential geometry and manifold learning theory. The algorithms are applied to objects represented by the dissimilarity measures. In Chapter 3 we describe how to reveal the origins of the non-Euclidean behaviors of the dissimilarity matrix for the purpose of rectifying the dissimilarities. We com- mence by developing a new measure which gauges the extent to which individual data give rises to departures from metricity in a set of dissimilarity data. This allows us to as- sess whether the non-Euclidean artifacts in a dataset can be attributed to individual objects or are distributed uniformly. The second novel contribution of Chapter 3 is to provide sim- ple empirical tests that can be used to determine the sources of the negative dissimilarity eigenvalues. We consider three sources of the negative dissimilarity eigenvalues, namely a) that the data resides on a manifold, b) that the objects may be extended and c) that there is Gaussian noise. We experiment with the algorithms on a set of public dissimilarities used in various applications available from the EU SIMBAD project. In Chapter 4, we propose a framework for rectifying the dissimilarities using Ricci flow on the manifolds so that the non-Euclidean artifacts are eliminated, as the second main contribution of this thesis. We consider the objects of interest to be represented by points on a manifold consisting of local patches with constant curvatures, and the given dissimilarities to be the geodesic distances on the manifold between these points. In dif- ferential geometry, Ricci flow changes the metric of a Riemannian manifold according to the curvature of the manifold. We seek to flatten the curved manifold so that a corrected set of Euclidean distances are obtained. We achieve this by deforming the manifold usingRicci flow. In the first technique, we consider each edge as a local patch and apply Ricci flow independently to flatten each patch. In this way, the local structure of the manifold is ignored, as Ricci flow is applied independently on each edge. To overcome this prob- lem, we propose a second technique, where add a curvature regularization process before evolving the manifold. Specifically we use the heat kernel to smooth out the curvatures on the edges. The results show both improved numerical stability and lower classification error in the embedded space. To reduce the reliance on the piecewise embedding and its effects on individual edges, we extend the previous two techniques and develop a third means of correcting non- Euclidean dissimilarity data as the first contribution of Chapter 5. This is done by using a tangent space reprojection to inflate the local hyperspherical patches and align the local patches with the shortest edge-connected path. These three Ricci-flow-based techniques proposed through this thesis are investigated as a means of correcting the dissimilarities so that the the non-Euclidean artefacts are eliminated. We experiment on two datasets represented by dissimilarities, namely the CoilYork and the Chickenpieces datasets. In the framework for correcting the non-Euclidean dissimilarities using the Ricci flow process, estimating the curvatures of the embedded manifold is an important component prior deforming the manifold. The second contribution of Chapter 5 is the investigation of the effects of the piecewise embedding methods (the kernel embedding and the Isomap embedding) on the curvatures computation and the introduction of a new way of com- puting the curvatures from a set of dissimilarities. We consider each local patch on a hypersphere, and deduce the enclosed volume of the points in terms of the curvature. We estimate the curvature by fitting the volume. We illustrate the utility of this method for estimating curvatures on the artificial dataset (2-sphere dataset).
Supervisor: Not available Sponsor: Not available
Qualification Name: Thesis (Ph.D.) Qualification Level: Doctoral
EThOS ID:  DOI: Not available