Use this URL to cite or link to this record in EThOS: http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.659596
Title: A phonetically transparent technique for the automatic transcription of speech
Author: Morony, Michael J.
Awarding Body: University of Edinburgh
Current Institution: University of Edinburgh
Date of Award: 1998
Availability of Full Text:
Access through EThOS:
Full text unavailable from EThOS. Please try the link below.
Access through Institution:
Abstract:
This thesis describes a technique for automatic transcription of speech that is both probabilistic and phonetically transparent. The technique employs classical methods of statistical pattern-recognition, but works within a phonetic framework that is simple to conceptualise, with no "black box" components or hidden states or layers. The technique being statistical, it involves a prior training-phase during which the features of the classes to be recognised are learnt from examples, and statistical models estimated for each class. These models subsequently become the basis of the recognition procedure. In the present implementation, the classes are classes of "sub-phonic" entities (where "sub-phonic" denotes parts of phones and phones are conceived as artificially isolated sections of the acoustic record of speech, each phone serving an identifiable linguistic function). The sub-phonic entities (or 'subphones') are defined in such a way as to take account of the underlying articulatory reality, and the manner of their definition imposes a significant amount of constraint on the search for the most probable utterance-transcription, a search executed using dynamic programming. The subphones are identified explicitly and modelled statistically in their own right. Explicit identification and modelling result in greater simplicity than is typical of other statistical modelling techniques. No use is made of durational features in their identification either in training or (in the basic implementation of the technique) in recognition. It is argued that duration should be modelled primarily not as a feature of phonetic classes, but rather of higher level structures, though it is also suggested that subphonic analysis of phones may provide a basis for relativistic within-phone duration-modelling.
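Illustrative note (not part of the thesis record): the abstract describes class models estimated from labelled examples, followed by a constrained dynamic-programming search for the most probable transcription. The sketch below is one minimal way such a pipeline could look, not the author's implementation. It assumes one-dimensional features, a single Gaussian per class, and a hypothetical `successors` table stating which subphone may follow which; the names `train`, `transcribe`, and `log_gauss` are invented for illustration. In keeping with the abstract, it uses no duration or transition probabilities: a class may persist for any number of frames, and the only constraint is which class may come next.

```python
import math


def log_gauss(x, mean, var):
    """Log density of a one-dimensional Gaussian at x."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def train(examples):
    """Estimate (mean, variance) per class from labelled feature values.

    `examples` maps a class label to a list of observed feature values.
    """
    models = {}
    for label, values in examples.items():
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        models[label] = (mean, max(var, 1e-6))  # floor avoids zero variance
    return models


def transcribe(frames, models, successors):
    """Return the most probable label per frame under successor constraints.

    `successors[p]` lists the labels allowed to follow label p (assumed
    table). A label may always follow itself, so no duration is modelled.
    """
    labels = list(models)
    # best[l]: best log score of any allowed path ending in l at this frame
    best = {l: log_gauss(frames[0], *models[l]) for l in labels}
    back = []  # one dict of back-pointers per frame after the first
    for x in frames[1:]:
        new_best, pointers = {}, {}
        for l in labels:
            preds = [p for p in labels if p == l or l in successors.get(p, ())]
            prev = max(preds, key=lambda p: best[p])
            new_best[l] = best[prev] + log_gauss(x, *models[l])
            pointers[l] = prev
        best = new_best
        back.append(pointers)
    # Trace back from the best final label
    path = [max(labels, key=lambda l: best[l])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path
```

Because every class is modelled and scored explicitly, each step of the search can be inspected directly: the score of a path is simply the sum of per-frame class log-likelihoods along an allowed label sequence.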
Supervisor: Not available
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID: uk.bl.ethos.659596
DOI: Not available