Use this URL to cite or link to this record in EThOS: http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.505787
Title: Informing multisource decoding in robust automatic speech recognition
Author: Ma, Ning
Awarding Body: The University of Sheffield
Current Institution: University of Sheffield
Date of Award: 2009
Abstract:
Listeners are remarkably adept at recognising speech in natural multisource environments, while most Automatic Speech Recognition (ASR) technology fails in these conditions. It has been proposed that this human ability is governed by Auditory Scene Analysis (ASA) processes, in which a sound mixture is segregated into perceptual packages, called 'streams', by a combination of bottom-up and top-down processing. This thesis examines a novel ASR framework based on the ASA account, Speech Fragment Decoding (SFD). A 'fragment' is a spectro-temporal region where energy from a single sound source dominates. SFD employs techniques developed from knowledge about the auditory system to identify fragments. A decoding process using statistical speech models is applied to the fragment representation to simultaneously identify speech evidence and recognise speech. A range of small-vocabulary speech recognition experiments are conducted for evaluation.
Supervisor: Not available
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID: uk.bl.ethos.505787
DOI: Not available