Use this URL to cite or link to this record in EThOS: http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.652960
Title: Phonetics of segmental F0 and machine recognition of Korean speech
Author: Jang, Tae-Yeoub
Awarding Body: University of Edinburgh
Current Institution: University of Edinburgh
Date of Award: 2000
Availability of Full Text:
Access through EThOS:
Full text unavailable from EThOS. Please try the link below.
Access through Institution:
Abstract:
The main goal of the study is to improve the performance of automatic Korean speech recognition by exploiting the fundamental frequency (F0) of vowels, which is affected by the identity of the preceding consonant. The hypothesis is that if the vowel F0 is given, the identification of the consonant can be more accurate. The effect, which I will call the "segmental F0 effect", has been confirmed by a number of phonetic studies across various languages. Most frequently, the F0 value of a vowel has been suggested to be a cue to the voiced/voiceless distinction of the preceding consonant. In Korean, segmental F0 can be useful for differentiating the three typical manners (lax, tense, and aspirated) of stop and affricate articulation. Earlier phonetic studies have found that F0 at vowel onset becomes higher after strong stops (e.g., tense and aspirated sounds) and lower after lax stops. It is also suggested that this effect is more salient in Korean than in European languages such as English and French. If the segmental F0 effect is to be helpful for speech recognition, it has to be detectable outside the carefully controlled data used for phonetic studies. I show that automatic measurements over a large amount of data can also capture the effect. Other related issues regarding segmental perturbation which have not been dealt with in earlier studies are also investigated. Integration of the segmental F0 effect with speech recognition is achieved using demisyllables as basic recognition units. As some demisyllables are composed of both an onset consonant and the front part of the nucleus, it is relatively easy for them to carry characteristics of the consonant-vowel relation, such as segmental F0, on their own. Moreover, I find that an HMM demisyllable-based recogniser performs better than a baseline HMM recogniser with phone-like units even before F0 is included. Thus, using demisyllables in Korean speech recognition has an independent motivation.
In addition, a lexicon modification technique based on pronunciation modelling is introduced to further enhance recognition performance. I show that including F0 in the demisyllable recogniser gives a further improvement in results.
Supervisor: Not available
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID: uk.bl.ethos.652960
DOI: Not available