Title: Cross-relation based blind identification of acoustic SIMO systems and applications
Author: Hu, Mathieu
ISNI: 0000 0004 6422 8258
Awarding Body: Imperial College London
Current Institution: Imperial College London
Date of Award: 2017
Speech signals captured by microphones placed at a distance from the speaker are corrupted by reverberation, i.e. sound waves reflected off hard surfaces such as walls and objects. The spectral distortion caused by reverberation drastically decreases the performance of automatic speech recognition systems and may degrade the intelligibility and the quality of speech for human listeners. The increased use of devices controlled by distant speech therefore creates a need for dereverberation. A possible approach to dereverberation is that of system equalization, which consists of the blind estimation of the room impulse responses from noisy reverberant signals followed by an inversion of these impulse responses. This thesis investigates the first part of this two-stage approach. The cross-relation method is adopted and exploited in two different ways. The first way follows the adaptive filter framework, which was first introduced in the context of blind identification of room impulse responses with the Multi-Channel Least Mean Square algorithm. By considering a block update of this stochastic gradient algorithm, a noise-robust algorithm is developed. The convergence rate of the resulting algorithm is then increased by using a locally optimal adaptive step-size. The cross-relation, expressed in the frequency domain, is then shown to contain the transfer function relating any of the microphones to a reference microphone. This relative transfer function can be used to reduce the number of variables to be estimated. However, the performance of the previous methods severely degrades when realistically long room impulse responses are considered. An alternative interpretation of the cross-relation, from an annihilation filter perspective, is therefore explored. The resulting algorithm is shown to be able to estimate room impulse responses of thousands of taps.
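For readers unfamiliar with the cross-relation, the following is a minimal sketch of the textbook two-channel, noise-free case; it is not the adaptive or annihilation-filter algorithms developed in the thesis. If one source s is observed through two channels h1 and h2, then x1 * h2 = x2 * h1 (with * denoting convolution), so the stacked channel vector lies in the null space of a matrix built from the observations alone. The function names, the toy four-tap channels and the signal length are all assumptions chosen for illustration, and the channel length L is assumed to be known.

```python
import numpy as np

def convolution_matrix(x, L):
    """Return the matrix C such that C @ h equals np.convolve(x, h) for len(h) == L."""
    N = len(x)
    C = np.zeros((N + L - 1, L))
    for k in range(L):
        C[k:k + N, k] = x
    return C

def cross_relation_estimate(x1, x2, L):
    """Estimate two length-L channels, up to a common scale factor.

    Uses the cross-relation x2 * h1 - x1 * h2 = 0: the stacked vector
    [h1; h2] is the right singular vector of A with the smallest
    singular value.
    """
    A = np.hstack([convolution_matrix(x2, L), -convolution_matrix(x1, L)])
    _, _, vt = np.linalg.svd(A)
    h = vt[-1]
    return h[:L], h[L:]

# Toy example (hypothetical values): a random source and two short channels.
rng = np.random.default_rng(0)
s = rng.standard_normal(200)
h1 = np.array([1.0, 0.5, -0.3, 0.1])
h2 = np.array([0.8, -0.4, 0.2, 0.05])
x1 = np.convolve(s, h1)
x2 = np.convolve(s, h2)

h1_est, h2_est = cross_relation_estimate(x1, x2, L=4)

# Blind estimates are only defined up to scale and sign, so compare directions.
true = np.concatenate([h1, h2])
true /= np.linalg.norm(true)
est = np.concatenate([h1_est, h2_est])
similarity = abs(est @ true)
```

In this idealized setting `similarity` should be very close to 1. The sketch relies on the two channels sharing no common zeros and on the absence of noise; with noise or with realistically long impulse responses the problem becomes much harder, which is the regime the abstract addresses.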
From a more practical perspective, the use of room impulse responses estimated with poor accuracy is investigated for the problem of speaker diarization. The spatial information captured in the direct-to-reverberant ratio is shown to be robust to high levels of error in the estimated room impulse responses. Blindly estimated direct-to-reverberant ratios combined with speech features in a single-channel diarization system are shown to provide additional information, which improves the performance of the diarization system.
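As an illustration of the quantity mentioned above, the direct-to-reverberant ratio (DRR) of a room impulse response can be sketched as the energy around the direct-path peak divided by the energy of everything else, in decibels. This is a generic definition applied to a known impulse response, not the blind estimator used in the thesis; the function name and the 2.5 ms half-window around the peak are assumptions.

```python
import numpy as np

def direct_to_reverberant_ratio_db(h, fs, direct_window_ms=2.5):
    """Return the DRR in dB of impulse response h sampled at fs Hz.

    The direct path is taken as the samples within +/- direct_window_ms
    of the largest-magnitude tap (an assumed convention); the remaining
    taps are counted as reverberant energy, which must be nonzero.
    """
    h = np.asarray(h, dtype=float)
    peak = int(np.argmax(np.abs(h)))
    half = int(round(direct_window_ms * 1e-3 * fs))
    start = max(peak - half, 0)
    stop = min(peak + half + 1, len(h))
    direct = np.sum(h[start:stop] ** 2)
    reverberant = np.sum(h[:start] ** 2) + np.sum(h[stop:] ** 2)
    return 10.0 * np.log10(direct / reverberant)

# Toy impulse response (hypothetical): a unit direct path and one weak reflection.
fs = 16000
h = np.zeros(1600)
h[100] = 1.0
h[800] = 0.1
drr_db = direct_to_reverberant_ratio_db(h, fs)
```

For this toy response the direct energy is 1 and the reflected energy is 0.01, so the ratio is 100, i.e. a DRR of about 20 dB.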
Supervisor: Naylor, Patrick A.; Brookes, Mike
Sponsor: European Union
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral