Use this URL to cite or link to this record in EThOS: http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.429911
Title: Robust text-independent speaker recognition over telecommunications systems
Author: Uzuner, Halil
ISNI: 0000 0001 3541 5917
Awarding Body: University of Surrey
Current Institution: University of Surrey
Date of Award: 2006
Abstract:
Biometric recognition methods, which use human features such as voice, face or fingerprints, are increasingly popular for user authentication. Voice is unique in that it is a non-intrusive biometric that can be transmitted over existing telecommunication networks, thereby allowing remote authentication. Current speaker recognition systems can provide high recognition rates on clean speech signals. However, their performance has been shown to degrade in real-life applications such as telephone banking, where speech compression and background noise can affect the speech signal. In this work, three advancements are introduced to improve speaker recognition performance where it is affected by coder mismatch, by the aliasing distortion caused by Line Spectral Frequency (LSF) parameter extraction, and by background noise. The first advancement investigates speaker recognition performance in a multi-coder environment using a Speech Coder Detection (SCD) system, which minimises the mismatch between training and testing data and so improves speaker recognition performance. With the speaker recognition error rates for the multi-coder environment reduced, the GSM-EFR speech coder is investigated further to deal with a particular problem related to the LSF parameter extraction method. It has previously been shown that the classic technique for extracting LSF parameters in speech coders is prone to aliasing distortion. Low-pass filtering of up-sampled LSF vectors has been shown to alleviate this problem, thereby improving speech quality. In this thesis, as a second advancement, the Non-Aliased LSF (NA-LSF) extraction method is introduced to reduce the unwanted effects of the GSM-EFR coder on speaker recognition performance. Another important factor that affects the performance of speaker recognition systems is the presence of background noise.
Background noise can severely reduce the performance of the targeted application, for example the quality of the coded speech or the accuracy of a speaker recognition system. The third advancement uses a noise canceller to improve speaker recognition performance in mismatched environments with varying background noise conditions. A speaker recognition system with a Minimum Mean Square Error Log-Spectral Amplitude (MMSE-LSA) noise canceller as a pre-processor is proposed and investigated to determine the effect of noise cancellation on speaker recognition performance for speech corrupted by different background noise conditions. The effects of noise cancellation on speaker recognition performance using coded noisy speech are also investigated.
Keywords: Identification, Verification, Recognition, Gaussian Mixture Models, Speech Coding, Noise Cancellation.
Supervisor: Not available
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID: uk.bl.ethos.429911
DOI: Not available