Title: Scalable inference and private co-training for Gaussian processes
Author: Thomas, Owen Matthew Truscott
Awarding Body: University of Oxford
Current Institution: University of Oxford
Date of Award: 2017
Availability of Full Text:
Access from EThOS:
Full text unavailable from EThOS. Please try the link below.
Two principal problems are pursued in this thesis: that of scaling inference for Gaussian process regression to very large numbers of data points, and that of differentially private co-training between multiple Gaussian processes with distinct private views of the data.

The first chapter acts as an introduction to Bayesian nonparametric regression and to standard techniques for performing scalable inference and differentially private communication with Gaussian processes.

The second chapter explores the use of Tucker decomposition and Kronecker structure in variational distributions in order to use very large numbers of inducing points for scalable variational Gaussian processes. The methods use a massive number of gridded inducing points, enabling more expansive variational approximations, yet their computational cost scales sublinearly with m, the number of inducing points used. Computational experiments are performed and compared with standard variational Gaussian processes, and the new method is demonstrated to be more efficient for data sets of moderate dimensionality.

The third chapter pursues the use of stochastic algorithms for evaluating unbiased approximations to the gradients of the lower bound for a scalable variational Gaussian process. Consequently, O(m²) inference is possible, in contrast to the O(m³) inference of a standard variational Gaussian process. Systematic computational experiments are performed and compared with a standard variational Gaussian process, with mixed results.

The fourth chapter considers the problem of preserving privacy when engaging in co-training for regression with multiple Gaussian processes, each trained on a different set of covariates, or 'view'. Information is exchanged between views through predictions on unlabelled data: various methods are introduced to incorporate the predictive distributions, and mechanisms for formally preserving the differential privacy of the response variable are included.
Analogies are drawn between the second chapter's Kronecker product across input dimensions and the fourth chapter's co-training across views. Experiments are performed on simulated and real data, and the results are presented and analysed. It is shown that differentially private exchange of information between views via predictions on unlabelled data can improve the performance of the models. The fifth chapter concludes.
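To illustrate the second chapter's theme: a product kernel evaluated on a Cartesian grid of inducing points has Kronecker structure, so matrix-vector products can be formed without ever materialising the full m × m matrix. The following NumPy sketch is illustrative only — the grid sizes, the RBF kernel, and all names are assumptions, not taken from the thesis:

```python
import numpy as np

def rbf(x, y, lengthscale=1.0):
    """Squared-exponential kernel between two 1-D point sets."""
    d = x[:, None] - y[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

# Inducing points on a 2-D Cartesian grid: m = m1 * m2
g1 = np.linspace(0.0, 1.0, 50)   # m1 = 50
g2 = np.linspace(0.0, 1.0, 40)   # m2 = 40
K1, K2 = rbf(g1, g1), rbf(g2, g2)

v = np.random.default_rng(0).normal(size=K1.shape[0] * K2.shape[0])

# Naive matvec: build the full (m1*m2) x (m1*m2) Kronecker matrix, O(m^2)
full = np.kron(K1, K2) @ v

# Structured matvec via the identity (K1 ⊗ K2) vec(X) = vec(K1 X K2^T)
# (row-major vec), costing O(m (m1 + m2)) without forming the full matrix
X = v.reshape(K1.shape[0], K2.shape[0])
fast = (K1 @ X @ K2.T).ravel()
```

The same reshape trick extends to grids in three or more dimensions, which is what makes very large gridded m affordable.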
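The abstract does not specify the third chapter's stochastic gradient estimators, but a standard ingredient of unbiased estimators of this kind is a Hutchinson-style trace estimator, which replaces an expensive deterministic trace with a Monte Carlo estimate using only matrix-vector products. A generic sketch (not the thesis's actual estimator; names and sizes are illustrative):

```python
import numpy as np

def hutchinson_trace(matvec, dim, n_probes, rng):
    """Unbiased estimate of tr(A) from matrix-vector products alone.

    E[z^T A z] = tr(A) when the entries of z are i.i.d. Rademacher (+/-1),
    so averaging over probes gives an unbiased trace estimate.
    """
    total = 0.0
    for _ in range(n_probes):
        z = rng.choice([-1.0, 1.0], size=dim)
        total += z @ matvec(z)
    return total / n_probes

rng = np.random.default_rng(0)
G = rng.normal(size=(30, 30))
A = G @ G.T  # symmetric PSD test matrix
est = hutchinson_trace(lambda z: A @ z, 30, 500, rng)
```

Because only matvecs with A are needed, each probe costs O(m²) rather than the O(m³) of an exact dense trace of a matrix function.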
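The fourth chapter's specific privacy mechanisms are likewise not detailed in the abstract; the standard building block for releasing a numeric prediction under ε-differential privacy is the Laplace mechanism, sketched below. The function name, sensitivity value, and the idea of noising each view's predictions before sharing are illustrative assumptions:

```python
import numpy as np

def laplace_mechanism(values, sensitivity, epsilon, rng):
    """Release values + Laplace(0, sensitivity/epsilon) noise.

    Gives epsilon-differential privacy for a query whose output changes
    by at most `sensitivity` when one response value changes.
    """
    scale = sensitivity / epsilon
    return values + rng.laplace(loc=0.0, scale=scale, size=np.shape(values))

# One view privatises its predictions on unlabelled points before sharing
rng = np.random.default_rng(0)
predictions = np.array([0.3, -1.2, 0.8])
private = laplace_mechanism(predictions, sensitivity=0.1, epsilon=1.0, rng=rng)
```

Smaller ε means stronger privacy but noisier exchanged predictions, which is the trade-off the thesis's experiments on co-training between views would need to navigate.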
Supervisor: Holmes, Christopher C.
Sponsor: Engineering and Physical Sciences Research Council
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available