Use this URL to cite or link to this record in EThOS: http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.494716
Title: Bayesian algorithms for speech enhancement
Author: Andrianakis, Ioannis
ISNI: 0000 0001 3424 4175
Awarding Body: University of Southampton
Current Institution: University of Southampton
Date of Award: 2007
Abstract:
The portability of modern voice processing devices allows them to be used in environments with adverse background noise. Such noise can degrade the quality of speech transmitted through these devices, but speech enhancement algorithms can ameliorate this degradation to some extent. The aim of this thesis, which consists of three main parts, is the development of speech enhancement algorithms that improve the quality of noisy speech. In the first part, we propose a framework of algorithms that estimate the clean speech Short Time Fourier Transform (STFT) coefficients. The algorithms are derived from Bayesian estimation theory and can be grouped according to (i) the STFT representation they estimate, (ii) the estimator they apply, and (iii) the speech prior density they assume. Besides introducing algorithms that outperform comparable algorithms in the literature, the framework offers insight into the effect and relative importance of the different components of the algorithms (e.g. prior, estimator) on the quality of the enhanced speech. In the second part, we develop methods for estimating the power of time-varying noise. The main outcome is a method that exploits similarities between the distribution of the noisy speech spectral amplitude coefficients within a single frequency bin and the corresponding distribution of the corrupting noise. These similarities allow samples that are more likely to correspond to noise to be extracted from a window of past spectral amplitude observations. The extracted samples are then used to estimate the noise power. In the final part, we incorporate the time and frequency dependencies of speech signals into our estimation model. The theoretical framework for this modelling is provided by Markov Random Fields (MRFs). We first develop a MAP estimator of speech based on the Gaussian MRF prior. We then introduce the Chi MRF, which is employed in the development of an improved speech estimator. Finally, the performance of fixed and adaptive schemes for the estimation of the MRF parameters is investigated.
Supervisor: White, Paul
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID: uk.bl.ethos.494716
DOI: Not available
Keywords: TK Electrical engineering. Electronics. Nuclear engineering; QC Physics