Use this URL to cite or link to this record in EThOS: https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.791876
Title: Gaussian process modelling for audio signals
Author: Wilkinson, William J.
Awarding Body: Queen Mary University of London
Current Institution: Queen Mary, University of London
Date of Award: 2019
Availability of Full Text:
Access from EThOS:
Access from Institution:
Abstract:
Audio signals are characterised and perceived based on how their spectral make-up changes with time. Uncovering the behaviour of latent spectral components is at the heart of many real-world applications involving sound, but is a highly ill-posed task given the infi nite number of ways any signal can be decomposed. This motivates the use of prior knowledge and a probabilistic modelling paradigm that can characterise uncertainty. This thesis studies the application of Gaussian processes to audio, which offer a principled non-parametric way to specify probability distributions over functions whilst also encoding prior knowledge. Along the way we consider what prior knowledge we have about sound, the way it behaves, and the way it is perceived, and write down these assumptions in the form of probabilistic models. We show how Bayesian time-frequency analysis can be reformulated as a spectral mixture Gaussian process, and utilise modern day inference methods to carry out joint time-frequency analysis and nonnegative matrix factorisation. Our reformulation results in increased modelling flexibility, allowing more sophisticated prior knowledge to be encoded, which improves performance on a missing data synthesis task. We demonstrate the generality of this paradigm by showing how the joint model can additionally be applied to both denoising and source separation tasks without modi cation. We propose a hybrid statistical-physical model for audio spectrograms based on observations about the way amplitude envelopes decay over time, as well as a nonlinear model based on deep Gaussian processes. We examine the benefi ts of these methods, all of which are generative in the sense that novel signals can be sampled from the underlying models, allowing us to consider the extent to which they encode the important perceptual characteristics of sound.
Supervisor: Not available Sponsor: Not available
Qualification Name: Thesis (Ph.D.) Qualification Level: Doctoral
EThOS ID: uk.bl.ethos.791876  DOI: Not available
Keywords: Electronic Engineering & Computer Science ; Gaussian Models ; Time-Frequency Analysis ; Audio analysis
Share: