Use this URL to cite or link to this record in EThOS: http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.404941
Title: Effects of visual degradation on audio-visual speech perception
Author: Sanderson, Mariana Welly.
Awarding Body: Birkbeck (University of London)
Current Institution: Birkbeck (University of London)
Date of Award: 2003
Availability of Full Text: Access through EThOS
Abstract:
Audio-visual speech recognition is considered to be a dynamic process that uses auditory and complementary visual speech cues. These cues are the products of the stream of timed and targeted movements of the articulators in the vocal tract used to produce speech. If the visual aspect of speech is absent or degraded, speech recognition in noise may deteriorate; this effect was used as a tool to investigate the visual aspect of speech recognition in the following experiments.

A series of shadowing and recall experiments assessed the effects of frame rate (temporal) and greyscale level (spatial) variations in the visual aspect of audio-visual presentations of sentences spoken in noisy backgrounds by three evenly illuminated speakers. Shadowing accuracy declined significantly as the frame rate of presentation fell; this was related to the importance of temporal synchrony in audio-visual speech. Shadowing and recall experiments, with recordings from one speaker in two illumination conditions and two greyscale levels, revealed that performance accuracy depended on the level of illumination in both tasks, for both the audio-visual experimental condition and the audio-alone control condition. Moreover, in poor illumination, recall performance was significantly less accurate at the lower greyscale level. This was related to the level of spatial facial information that may be used in speech recognition.

Shadowing and recall accuracy for the sentences' keywords was related to their degree of visible speech-related movement. Audio-visual shadowing accuracy varied little across the range of movements, but audio-alone shadowing accuracy declined significantly as the degree of movement increased. Visual and auditory target characteristics of words associated with differing degrees of audio-visual advantage and visual movement were determined.
The findings were considered in the context of a dynamic model of speech processing, which is dependent on patterns of the timings and targets of the auditory and visual speech signals.
Supervisor: Not available
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID: uk.bl.ethos.404941
DOI: Not available