Use this URL to cite or link to this record in EThOS: http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.661365
Title: Nasality in automatic speaker verification
Author: Rooney, Edmund Joseph
Awarding Body: University of Edinburgh
Current Institution: University of Edinburgh
Date of Award: 1990
Availability of Full Text: Full text unavailable from EThOS.
Abstract:
This thesis examines the suitability of nasal resonance patterns as a means of authenticating speakers' identities in an automatic speaker verification system. The inadequacy of traditional methods of ascertaining identity in commerce and industry - the possession of keys or PINs, for example - has prompted researchers to look at attributes which are inseparable from the person who possesses them ('biometric' features: that is, features which are part of a person's physical make-up, or aspects of their performance of a task). The use of speech in this application has received much attention, despite its inherent variability. Much of the research uses whole-word templates (text-dependent) or long-term statistical measures (text-independent), but a third approach - segmental analysis - has also proved useful, because it concentrates on features of speech which are known to be highly speaker-dependent. The nasal cavities in particular are known to vary considerably from speaker to speaker, and to be relatively fixed in their size and shape. The acoustic analysis of nasality is complicated by the manner of its production, however, which introduces anti-resonances or transfer function zeros into the spectrum. This renders the most popular analysis technique, Linear Predictive Coding, inherently inaccurate, since it assumes a vocal tract transfer function which has all poles (resonances) and no zeros. In this thesis, the potential of nasality is re-examined using a relatively new but established technique, cepstral decomposition, which allows accurate estimation of both pole and zero frequencies. The efficacy of this technique is demonstrated on both synthetic speech and nasal stops, and a modification is introduced to reduce the detrimental effects of overestimation of the all-zero model order.
A review of acoustic, anatomical and phonetic aspects of nasality suggests that while nasality does not offer an invariant acoustic marker of identity (the nasal tract proving extremely variable and its contribution to the spectrum depending extensively on the rest of the vocal tract), it still offers the most favourable phonetic environment for the purposes of speaker verification. The velar nasal stop is chosen for study, since its spectrum shows the greatest dependence on unalterable nasal tract characteristics and the greatest resistance to changes elsewhere in the vocal tract (e.g. lingual coarticulation).
Supervisor: Not available
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID: uk.bl.ethos.661365
DOI: Not available