Title: Exploiting phonological constraints and automatic identification of speaker classes for Arabic speech recognition
Author: Alsharhan, Iman
ISNI: 0000 0004 5352 6874
Awarding Body: University of Manchester
Current Institution: University of Manchester
Date of Award: 2014
The aim of this thesis is to investigate a number of factors that could affect the performance of an Arabic automatic speech understanding (ASU) system. The work described in this thesis belongs to the automatic speech recognition (ASR) phase, but the fact that it is part of an ASU project rather than a stand-alone piece of work on ASR influences the way in which it is carried out. Our main concern in this work is to determine the best way to exploit the phonological properties of the Arabic language in order to improve the performance of the speech recogniser. One of the main challenges facing the processing of Arabic is the effect of the local context, which induces changes in the phonetic representation of a given text, thereby causing the recognition engine to misclassify it. The proposed solution is to develop a set of language-dependent grapheme-to-allophone rules that can predict such allophonic variations and eventually provide a phonetic transcription that is sensitive to the local context for the ASR system. The novel aspect of this method is that the pronunciation of each word is extracted directly from a context-sensitive phonetic transcription rather than a predefined dictionary that typically does not reflect the actual pronunciation of the word. Besides investigating the boundary effect on pronunciation, the research also seeks to address the problem of Arabic's complex morphology. Two solutions are proposed to tackle this problem, namely, using underspecified phonetic transcription to build the system, and using phonemes instead of words to build the hidden Markov models (HMMs). The research also seeks to investigate several technical settings that might have an effect on the system's performance. These include training on the sub-population to minimise the variation caused by training on the main undifferentiated population, as well as investigating the correlation between training size and performance of the ASR system.
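As a rough illustration of what a context-sensitive pronunciation rule can look like, the sketch below applies one well-known Arabic rule: the /l/ of the definite article "al-" assimilates to a following "sun letter", so "al-shams" is pronounced "ash-shams". This is not the rule set developed in the thesis. The romanised input, the reduced consonant list (emphatic consonants are left out), and the function names `apply_sun_letter_assimilation` and `transcribe` are all assumptions made for this example.

```python
# Illustrative sketch only: a single rule over romanised text,
# not the grapheme-to-allophone rules described in the thesis.

# Sun letters in a simple romanisation (assumed); digraphs come first so
# that "sh" is matched before "s" and "th"/"dh" before "t"/"d".
SUN_LETTERS = ["sh", "th", "dh", "t", "d", "r", "z", "s", "l", "n"]


def apply_sun_letter_assimilation(word: str) -> str:
    """Rewrite 'al-' + sun letter so the /l/ copies the next consonant."""
    prefix = "al-"
    if not word.startswith(prefix):
        return word  # rule does not apply without the definite article
    stem = word[len(prefix):]
    for consonant in SUN_LETTERS:
        if stem.startswith(consonant):
            # The /l/ is replaced by a copy of the stem-initial consonant.
            return "a" + consonant + "-" + stem
    return word  # "moon letter": the /l/ is pronounced as written


def transcribe(text: str) -> str:
    """Apply the rule to every whitespace-separated word in the text."""
    return " ".join(apply_sun_letter_assimilation(w) for w in text.split())


if __name__ == "__main__":
    print(transcribe("al-shams wa al-qamar"))
```

The point of the sketch is that the output depends on the neighbouring sound rather than on a fixed dictionary entry; a full system would need many such rules, including ones that apply across word boundaries.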
Supervisor: Ramsay, Allan
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available
Keywords: Modern standard Arabic ; speech processing ; automatic speech recognition ; phonological rules ; sound-spelling correspondences