Use this URL to cite or link to this record in EThOS: http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.702787
Title: Speech enhancement in the modulation domain
Author: Wang, Yu
Awarding Body: Imperial College London
Current Institution: Imperial College London
Date of Award: 2015
Availability of Full Text: Full text unavailable from EThOS. Access is from the awarding institution.
Abstract:
The goal of a speech enhancement algorithm is to reduce or eliminate background noise without distorting the speech signal. Although speech enhancement is important in practical scenarios, it is a difficult task, especially when the noisy speech signal is available from only a single channel. Many single-channel speech enhancement algorithms have been proposed that can improve the Signal-to-Noise Ratio (SNR) of the noisy speech, in some cases dramatically, but they also introduce speech distortion and spurious tonal artefacts known as musical noise. There is evidence, both physiological and psychoacoustic, to support the significance of the modulation domain, i.e. the temporal modulation of the acoustic spectral components, for speech enhancement. In this thesis, three methods for implementing single-channel speech enhancement in the modulation domain are proposed. The goal in all three cases is to take advantage of prior knowledge about the temporal modulation of short-time spectral amplitudes. The first method post-processes the output of a conventional single-channel speech enhancement algorithm using a modulation-domain Kalman filter. The second method performs enhancement directly in the modulation domain, based on the assumption that the temporal sequence of spectral amplitudes within each frequency bin lies within a low-dimensional subspace. The third method uses a modulation-domain Kalman filter to perform enhancement using two alternative distribution families for the speech and noise amplitude prior distributions. The performance of the proposed enhancement algorithms is assessed by measuring the SNR and the speech quality of the enhanced speech, the latter using the Perceptual Evaluation of Speech Quality (PESQ) metric. It is found that, for a range of noise types, the proposed algorithms give consistent improvements in both metrics.
Supervisor: Brookes, Mike
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID: uk.bl.ethos.702787
DOI: Not available