Use this URL to cite or link to this record in EThOS: http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.637460
Title: First investigation and experiments in Welsh speech recognition
Author: Jones, R. J.
Awarding Body: University of Wales Swansea
Current Institution: Swansea University
Date of Award: 2003
Abstract:
The work described in this thesis aims to efficiently develop automatic speech recognition (ASR) in languages for which no such technology currently exists. The focus is on minority languages such as Welsh. An overview of the challenges of ASR development in lesser-spoken languages is presented. The specification of a 2000-speaker database for Welsh is described, with special reference to a novel lexicon searching process for Celtic minority languages. The collection of this database is also detailed, along with an analysis of the pitfalls facing those collecting minority language speech resources for ASR, and ways to overcome them. ASR is carried out on a small subset of the Welsh database (no more than 350 male speakers uttering one isolated digit each). This simulates a worst-case scenario for a language having only limited funds for a database collection. It is found that for Welsh, together with English and German, no more than 100-125 training speakers are required to reach a point beyond which the improvement in recognition performance is logarithmic. To reduce the number of training speakers beyond which this logarithmic improvement occurs, a model combination method similar to Yoshizawa et al.'s (Eurospeech 2001) is investigated. Model combination is achieved by creating a phonetic map between Welsh and German, in a manner similar to that proposed by Dalsgaard et al. (Eurospeech 1991). Use of the model combination method reduces the point at which logarithmic improvement occurs from 100-125 to 75-100 speakers.
Supervisor: Not available
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID: uk.bl.ethos.637460
DOI: Not available