Use this URL to cite or link to this record in EThOS: https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.772882
Title: Automatic framework to aid therapists to diagnose children who stutter
Author: Alharbi, Sadeen
Awarding Body: University of Sheffield
Current Institution: University of Sheffield
Date of Award: 2018
Abstract:
This thesis studies the feasibility of developing an automated framework that provides an indication of the severity level of stuttering for children who stutter. Diagnosing this condition early is extremely important to ensure adequate therapeutic treatment while a child is still young. However, correct diagnoses depend heavily on the availability of expert therapists and on their painstaking manual work to record, transcribe and then count different kinds of speech disfluency. Where such expertise is rare, an automated framework would be immensely helpful. The main challenge facing the development of such a system is the scarcity of available training data. Corpora of children's speech are limited, and corpora of children's stuttering speech are more limited still, so certain direct machine-learning techniques could not be used for lack of training data. Furthermore, the best available stuttering data set is not fully transcribed. Our proposed solution deploys a number of approaches that make best use of all the available data. We combine a standard children's corpus (PF-Star) with a small corpus of stuttering speech (UCLASS), the latter of which we transcribe. We focus on automatic speech recognition (ASR) methods to decode stuttering utterances, making best use of frequency information to segment the speech into words and part-words. We compare two methods: one based on training a conventional ASR system whose language model is augmented with extra stuttering events, such as pseudo-words and repetitions; and the other based on task-specific lattices derived from the original reading prompt, with manually tuned stuttering arcs. We then focus on autocorrelation methods to detect cases of prolongation, making best use of time information, in relation to the subject's speaking rate.
We also investigate a number of approaches to identifying different kinds of stuttering event in transcriptions (both manual and produced by ASR), comparing conditional random field (CRF) and bi-directional long short-term memory (BLSTM) detectors. Finally, we use an algorithm based on Guitar, Yairi and Ambrose's classification of stuttering severity to partition all subjects into three classes: normal disfluency, borderline stuttering and beginning stuttering. The resulting diagnoses were evaluated by comparing them against diagnoses made by two UK-registered speech and language therapists, and ranged from 72% to 100% accurate (after tuning). The benefits of the work include supporting speech therapists by automatically transcribing recorded sessions, identifying stuttering events and providing an accurate early indication of a diagnosis. This could help children receive suitable therapy, commensurate with the severity of their stuttering. If deployed in the Cloud, this work could help in the remote diagnosis of children in areas where expertise is limited. In this setting, research benefits would include the further collection of stuttering data.
Supervisor: Simons, Anthony
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID: uk.bl.ethos.772882
DOI: Not available