Title: Usability engineering of surname capture strategies in automated telephony and multimodal spoken language dialogue services
Author: Davidson, Nancie
Awarding Body: University of Edinburgh
Current Institution: University of Edinburgh
Date of Award: 2007
Availability of Full Text: Full text unavailable from EThOS.
Surname capture via automatic speech recognition has many potential applications, including automated directory assistance and travel reservation services. It is, however, a difficult challenge, firstly because of the large set of names involved in many of the potential applications and secondly because of the lack of standardised pronunciations for many of these names. Previous work has explored a variety of approaches to proper name recognition but has focused on recognition accuracy alone, with few attempts to assess user reaction to the various strategies investigated. The work presented in this thesis addresses this by examining the problem of automated surname capture from a user perspective. In doing so it seeks to advance knowledge in the field of spoken language dialogue services through examination of a particular problem that nonetheless has wide applicability. Data from a series of three controlled experiments are presented in which the usability of three different strategies for surname capture is empirically evaluated in both automated telephony and multimodal contexts. The focus of the multimodal work is on spoken language dialogue services in which graphical output is employed in the form of an embodied conversational agent. The underlying thesis of the work is that, through careful dialogue engineering, automated surname capture using current speech recognition technology (and by extension other proper name tasks) can be made highly usable. The evaluation methodology employed throughout provides both quantitative and qualitative data on user attitudes, together with objective measures of performance. It thus provides a comprehensive measure of usability that is missing not only from work on proper name recognition, but from the wider field of spoken language dialogue services as a whole. 
In particular, systematic usability evaluations of embodied conversational agent applications in which realistic speech recognition is employed are rare, hence the need for the evaluations presented here. The results show that in order to achieve a high level of usability the use of spelling information is vital in strategies for automated surname capture. This is true in both automated telephony contexts and multimodal interfaces of the type examined in the research. Moreover, where text output is available this can also improve the usability of the process.
Supervisor: Not available
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available