Use this URL to cite or link to this record in EThOS:
Title: Formally justified and modular Bayesian inference for probabilistic programs
Author: Ścibior, Adam Michał
ISNI:       0000 0004 7968 5849
Awarding Body: University of Cambridge
Current Institution: University of Cambridge
Date of Award: 2019
Availability of Full Text:
Access from EThOS:
Full text unavailable from EThOS. Please try the link below.
Access from Institution:
Probabilistic modelling offers a simple and coherent framework to describe the real world in the face of uncertainty. Furthermore, by applying Bayes' rule it is possible to use probabilistic models to make inferences about the state of the world from partial observations. While traditionally probabilistic models were constructed on paper, more recently the approach of probabilistic programming enables users to write the models in executable languages resembling computer programs and to freely mix them with deterministic code. It has long been recognised that the semantics of programming languages is complicated and the intuitive understanding that programmers have is often inaccurate, resulting in difficult to understand bugs and unexpected program behaviours. Programming languages are therefore studied in a rigorous way using formal languages with mathematically defined semantics. Traditionally formal semantics of probabilistic programs are defined using exact inference results, but in practice exact Bayesian inference is not tractable and approximate methods are used instead, posing a question of how the results of these algorithms relate to the exact results. Correctness of such approximate methods is usually argued somewhat less rigorously, without reference to a formal semantics. In this dissertation we formally develop denotational semantics for probabilistic programs that correspond to popular sampling algorithms often used in practice. The semantics is defined for an expressive typed lambda calculus with higher-order functions and inductive types, extended with probabilistic effects for sampling and conditioning, allowing continuous distributions and unbounded likelihoods. It makes crucial use of the recently developed formalism of quasi-Borel spaces to bring all these elements together. We provide semantics corresponding to several variants of Markov chain Monte Carlo and Sequential Monte Carlo methods and formally prove a notion of correctness for these algorithms in the context of probabilistic programming. We also show that the semantic construction can be directly mapped to an implementation using established functional programming abstractions called monad transformers. We develop a compact Haskell library for probabilistic programming closely corresponding to the semantic construction, giving users a high level of assurance in the correctness of the implementation. We also demonstrate on a collection of benchmarks that the library offers performance competitive with existing systems of similar scope. An important property of our construction, both the semantics and the implementation, is the high degree of modularity it offers. All the inference algorithms are constructed by combining small building blocks in a setup where the type system ensures correctness of compositions. We show that with basic building blocks corresponding to vanilla Metropolis-Hastings and Sequential Monte Carlo we can implement more advanced algorithms known in the literature, such as Resample-Move Sequential Monte Carlo, Particle Marginal Metropolis-Hastings, and Sequential Monte Carlo squared. These implementations are very concise, reducing the effort required to produce them and the scope for bugs. On top of that, our modular construction enables in some cases deterministic testing of randomised inference algorithms, further increasing reliability of the implementation.
Supervisor: Ghahramani, Zoubin Sponsor: Engineering and Physical Sciences Research Council ; Cambridge Trust ; Cambridge-Tuebingen Programme
Qualification Name: Thesis (Ph.D.) Qualification Level: Doctoral
Keywords: probabilistic programming ; machine learning ; Bayesian inference ; functional programming ; denotational semantics ; Haskell ; Monte Carlo ; Markov chain Monte Carlo ; Sequential Monte Carlo ; quasi-Borel spaces ; monads ; monad transformers ; programming languages ; lambda calculus ; importance sampling