Use this URL to cite or link to this record in EThOS: http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.656127
Title: Confusion modelling for lip-reading
Author: Howell, Dominic
ISNI: 0000 0004 5347 105X
Awarding Body: University of East Anglia
Current Institution: University of East Anglia
Date of Award: 2015
Abstract:
Lip-reading is mostly used as a means of communication by people with hearing difficulties. Recent work has explored the automation of this process, with the aim of building a speech recognition system entirely driven by lip movements. However, this work has so far produced poor results because of factors such as high variability of speaker features, difficulties in mapping from visual features to speech sounds, and high co-articulation of visual features. The work in this thesis is inspired by previous work in dysarthric speech recognition [Morales, 2009]. Dysarthric speakers have poor control over their articulators, often leading to a reduced phonemic repertoire. The premise of this thesis is that recognition of the visual speech signal is a similar problem to recognition of dysarthric speech, in that some information about the speech signal has been lost in both cases, and this brings about a systematic pattern of errors in the decoded output. This work attempts to exploit the systematic nature of these errors by modelling them in the framework of a weighted finite-state transducer cascade. Results indicate that the technique can achieve slightly lower error rates than the conventional approach. In addition, the thesis explores some interesting, more general questions for automated lip-reading.
Supervisor: Not available
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID: uk.bl.ethos.656127
DOI: Not available