Use this URL to cite or link to this record in EThOS: https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.799924
Title: Probabilistic machine learning : methods and applications to continuous control
Author: Hasenclever, Leonard
ISNI: 0000 0004 8506 8775
Awarding Body: University of Oxford
Current Institution: University of Oxford
Date of Award: 2018
Availability of Full Text:
Access from EThOS:
Full text unavailable from EThOS. Please try the link below.
Access from Institution:
Abstract:
Probabilistic inference is at the core of many recent advances in machine learning. Unfortunately, exact inference is intractable in all but the simplest models. Thus, approximate inference methods are required to use probabilistic methods for large and complex models. Broadly speaking, there are two different paradigms for approximate inference: sampling methods and variational methods. Sampling methods attempt to construct approximate samples from the target distribution, which can then be used to approximate expectations with respect to the posterior. Variational methods instead rephrase inference as an optimisation problem and form parametric approximations to the target distribution. In this thesis, we present contributions to sampling methods and variational methods with a focus on scalability. Firstly, we introduce a novel sampling technique based on Hamiltonian Monte Carlo that uses a relativistic kinetic energy to improve robustness to hyperparameters. We then describe a novel algorithm for distributed Bayesian learning based on expectation propagation techniques. In addition, we present a novel normalising flow that can be used to form more flexible variational approximations within variational inference. We then describe two applications of probabilistic thinking and variational techniques to the field of continuous control. First, we describe how reinforcement learning can be viewed as probabilistic inference and introduce a novel algorithm for learning priors in reinforcement learning, leading to substantial improvement in learning speed and final performance in certain settings. Lastly, we describe a probabilistic model that can be used to compress thousands of expert policies trained to reproduce motion capture data into one model that is capable of one-shot imitation. We further demonstrate that it is possible to reuse our model, resulting in naturalistic movements on challenging control tasks.
Supervisor: Teh, Yee Whye
Sponsor: Engineering and Physical Sciences Research Council
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID: uk.bl.ethos.799924
DOI: Not available