Title:

Spectral geometry for structural pattern recognition

Graphs are used pervasively in computer science as representations of data with a network or relational structure, where the graph structure provides a flexible representation such that there is no fixed dimensionality for objects. However, the analysis of data in this form has proved an elusive problem; for instance, it suffers from the robustness to structural noise. One way to circumvent this problem is to embed the nodes of a graph in a vector space and to study the properties of the point distribution that results from the embedding. This is a problem that arises in a number of areas including manifold learning theory and graphdrawing. In this thesis, our first contribution is to investigate the heat kernel embedding as a route to computing geometric characterisations of graphs. The reason for turning to the heat kernel is that it encapsulates information concerning the distribution of path lengths and hence node affinities on the graph. The heat kernel of the graph is found by exponentiating the Laplacian eigensystem over time. The matrix of embedding coordinates for the nodes of the graph is obtained by performing a YoungHouseholder decomposition on the heat kernel. Once the embedding of its nodes is to hand we proceed to characterise a graph in a geometric manner. With the embeddings to hand, we establish a graph characterization based on differential geometry by computing sets of curvatures associated with the graph nodes, edges and triangular faces. The second contribution comes from the need to solve the problem that arise in the processing of a noisy data over a graph. The Principal difficulty of this task, is how to preserve the geometrical structures existing in the initial data. Bringing together several, distinct concepts that have received some independent recent attention in machine learning; we propose a framework to regularize realvalued or vectorvalued functions on weighted graphs of arbitrary topology. The first of these is deduced from the concepts of the spectral graph theory that have been applied to a wide range of clustering and classification tasks over the last decades taking in consideration the properties of the graph \(p\)Laplacian as a nonlinear extension of the usual graph Laplacian. The second one is the geometric point of view comes from the heat kernel embedding of the graph into a manifold. In these techniques we use the geometry of the manifold by assuming that it has the geometric structure of a Riemannian manifold. The third important conceptual framework comes from the manifold regularization which extends the classical framework of regularization in the sense of reproducing Hilbert Spaces to exploit the geometry of the embedded set of points. The proposed framework, based on the \(p\)Laplacian operators considering minimizing a weighted sum of two energy terms: a regularization one and an additional approximation term which helps to avoid the shrinkage effects obtained during the regularization process. The data are structured by functions depending on data features, the curvature attributes associated with the geometric embedding of the graph. The third contribution is inspired by the concepts and techniques of the graph calculus of partial differential functions. We propose a new approach for embedding graphs on pseudoRiemannian manifolds based on the wave kernel which is the solution of the wave equation on the edges of a graph. The eigensystem of the wavekernel is determined by the eigenvalues and the eigenfunctions of the normalized adjacency matrix and can be used to solve the edgebased wave equation. By factorising the Grammatrix for the wavekernel, we determine the embedding coordinates for nodes under the wavekernel. The techniques proposed through this thesis are investigated as a means of gauging the similarity of graphs. We experiment on sets of graphs representing the proximity of image features in different views of different objects in three different datasets namely, the York model house, the COIL20 and the TOY databases.
