Title: Modelling asynchrony in the articulation of speech for automatic speech recognition
Author: Wilkinson, Nicholas.
ISNI:       0000 0001 3568 4893
Awarding Body: University of Birmingham
Current Institution: University of Birmingham
Date of Award: 2003
Current automatic speech recognition systems assume that all the articulators in the vocal tract move in synchrony with one another to produce speech. This thesis describes the development of a more realistic model that allows some asynchrony between the articulators, with the aim of improving speech recognition accuracy. Experiments on the TIMIT database demonstrate that higher phone recognition accuracy is obtained by modelling the voiced and voiceless components of speech separately, which is done by splitting the speech spectrum into high and low frequency bands. Modelling further articulator asynchrony in speech production requires a representation of speech that is closer to the actual production process. Formant frequency parameters are integrated into a typical Mel-frequency cepstral coefficient representation, and their effect on recognition accuracy is observed. Formant frequency estimates can be made accurately only when the formants are visible in the spectrum, so a technique is developed to ignore frequency estimates generated when the formants are not visible. The formant data allows a unique method of vocal tract normalization, which improves recognition accuracy. Finally, a classification experiment examines the potential improvement in speech recognition accuracy from modelling asynchrony between the articulators by allowing asynchrony between all the formants.
Supervisor: Not available
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:  DOI: Not available