Title: On auxiliary variables and many-core architectures in computational statistics
Author: Lee, Anthony
Awarding Body: University of Oxford
Current Institution: University of Oxford
Date of Award: 2011
Availability of Full Text:
Access through EThOS:
Full text unavailable from EThOS. Restricted access.
Access through Institution:
Abstract:
Emerging many-core computer architectures provide an incentive for computational methods to exhibit specific types of parallelism. Our ability to perform inference in Bayesian statistics is often dependent upon our ability to approximate expectations of functions of random variables, for which Monte Carlo methodology provides a general purpose solution using a computer. This thesis is primarily concerned with exploring the gains that can be obtained by using many-core architectures to accelerate existing population-based Monte Carlo algorithms, as well as providing a novel general framework that can be used to devise new population-based methods. Monte Carlo algorithms are often concerned with sampling random variables taking values in X whose density is known up to a normalizing constant. Population-based methods typically make use of collections of interacting auxiliary random variables, each of which is in X, in specifying an algorithm. Such methods are good candidates for parallel implementation when the collection of samples can be generated in parallel and their interaction steps are either parallelizable or negligible in cost.
The first contribution of this thesis is in demonstrating the potential speedups that can be obtained for two common population-based methods, population-based Markov chain Monte Carlo (MCMC) and sequential Monte Carlo (SMC).
The second contribution of this thesis is in the derivation of a hierarchical family of sparsity-inducing priors in regression and classification settings. Here, auxiliary variables make possible the implementation of a fast algorithm for finding local modes of the posterior density. SMC, accelerated on a many-core architecture, is then used to perform inference for a range of prior specifications to gain an understanding of sparse association signal in the context of genome-wide association studies.
The third contribution is in the use of a new perspective on reversible MCMC kernels that allows for the construction of novel population-based methods. These methods differ from most existing methods in that one can make the resulting kernels define a Markov chain on X. A further development is that one can define kernels in which the number of auxiliary variables is given a distribution conditional on the values of the auxiliary variables obtained so far. This is perhaps the most important methodological contribution of the thesis, and the adaptation of the number of particles used within a particle MCMC algorithm provides a general purpose algorithm for sampling from a variety of complex distributions.
Supervisor: Holmes, Christopher
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available
Keywords: Statistics (see also social sciences); Computationally-intensive statistics; Markov chain Monte Carlo; Sequential Monte Carlo; Bayesian Inference