Use this URL to cite or link to this record in EThOS: http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.337971
Title: Identification and correction of speech repairs in the context of an automatic speech recognition system
Author: Johnson, Kevin
ISNI: 0000 0001 2411 4262
Awarding Body: Durham University
Current Institution: Durham University
Date of Award: 1997
Abstract:
Recent advances in automatic speech recognition systems for read (dictated) speech have led researchers to confront the problem of recognising more spontaneous speech. A number of problems, such as disfluencies, appear when read speech is replaced with spontaneous speech. In this work we deal specifically with what we class as speech-repairs. Most disfluency processes deal with speech-repairs at the sentence level. This is too late in the process of speech understanding. Speech recognition systems have problems recognising speech containing speech-repairs. The approach taken in this work is to deal with speech-repairs during the recognition process. Through an analysis of spontaneous speech the grammatical structure of speech-repairs was identified as a possible source of information. It is this grammatical structure, along with some pattern matching to eliminate false positives, that is used in the approach taken in this work. These repair structures are identified within a word lattice and when found result in a SKIP being added to the lattice to allow the reparandum of the repair to be ignored during the hypothesis generation process. Word fragment information is included using a sub-word pattern matching process and cue phrases are also identified within the lattice and used in the repair detection process. These simple, yet effective, techniques have proved very successful in identifying and correcting speech-repairs in a number of evaluations performed on a speech recognition system incorporating the repair procedure.
On an unseen spontaneous lecture taken from the Durham corpus, using a dictionary of 2,275 words and phoneme corruption of 15%, the system achieved a correction recall rate of 72% and a correction precision rate of 75%. The achievements of the project include the automatic detection and correction of speech-repairs, including word fragments and cue phrases, in the sub-section of an automatic speech recognition system processing spontaneous speech.
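Illustrative sketch (not taken from the thesis): the abstract describes adding a SKIP to the word lattice so that the reparandum can be ignored during hypothesis generation. The Python below is a minimal toy version of that idea under strong assumptions. The lattice is a single linear word chain rather than a scored lattice of competing words, and the detector is a simple word-match rule standing in for the grammatical repair structures used in the thesis. The cue-phrase list, the choice to skip the cue phrase together with the reparandum, and all function names (build_linear_lattice, add_skip_arcs, hypotheses) are hypothetical.

```python
from collections import defaultdict

SKIP = "<SKIP>"
# Hypothetical cue phrases; the thesis's actual list is not given in the abstract.
CUE_PHRASES = [("i", "mean"), ("sorry",)]


def build_linear_lattice(words):
    """Toy lattice: arcs[start] is a list of (label, end) pairs."""
    arcs = defaultdict(list)
    for i, word in enumerate(words):
        arcs[i].append((word, i + 1))
    return arcs


def add_skip_arcs(arcs, words, max_len=3):
    """Add a SKIP arc over each candidate reparandum (assumed word-match rule).

    A span words[i:i+n] counts as a reparandum if either
      * a cue phrase follows it and the next word repeats words[i], or
      * it is immediately repeated word for word.
    Returns the sorted list of (start, end) node pairs that were bridged.
    """
    found = set()
    for i in range(len(words)):
        for n in range(1, max_len + 1):
            after = i + n
            if after >= len(words):
                break
            # Case 1: reparandum, cue phrase, then a repair starting with the same word.
            for cue in CUE_PHRASES:
                j = after + len(cue)
                if (tuple(words[after:j]) == cue and j < len(words)
                        and words[j] == words[i]):
                    found.add((i, j))
            # Case 2: exact repetition with no cue phrase.
            if words[i:after] == words[after:after + n]:
                found.add((i, after))
    for start, end in sorted(found):
        arcs[start].append((SKIP, end))
    return sorted(found)


def hypotheses(arcs, start, end):
    """Enumerate every word sequence from start to end; SKIP arcs emit no word."""
    results = []

    def walk(node, path):
        if node == end:
            results.append(" ".join(path))
            return
        for label, nxt in arcs[node]:
            walk(nxt, path if label == SKIP else path + [label])

    walk(start, [])
    return results


words = "show me the flights i mean the fares".split()
lattice = build_linear_lattice(words)
skips = add_skip_arcs(lattice, words)
for hyp in hypotheses(lattice, 0, len(words)):
    print(hyp)
```

In this toy input the span "the flights i mean" is bridged by a SKIP arc, so the corrected reading "show me the fares" appears alongside the unrepaired word string. In a real recogniser the competing hypotheses would then be ranked by acoustic and language-model scores, which this sketch omits.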
Supervisor: Not available
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID: uk.bl.ethos.337971
DOI: Not available
Keywords: Computer software & programming; Computer software; Pattern recognition systems; Pattern perception; Image processing