Use this URL to cite or link to this record in EThOS: http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.639099
Title: A segmental mixture model, maximising data use with time sequence information
Author: Stapert, R. P.
Awarding Body: University of Wales Swansea
Current Institution: Swansea University
Date of Award: 2000
Abstract:
Here, time sequence information is explored as a means of increasing the amount of speaker-specific information that can be gained from limited data. One of the popular approaches in speaker recognition at the time of writing is Gaussian mixture modelling, which, as implemented here, does not use time sequence information. In this thesis, an attempt is made to use time sequence information without any prior linguistic knowledge or labelling of the databases. This is achieved by embedding dynamic time warping into a Gaussian mixture model structure. The thesis covers the main points that need to be investigated in order to create a viable foundation for the inclusion of dynamic time warping in a Gaussian mixture model. The experimental results show that temporal constraints offer better speaker discrimination than unconstrained nearest neighbour decisions. It is also shown that using speech segments shorter than the actual utterance, in combination with dynamic time warping, can provide additional error reduction. This foundation work motivates the work on Gaussian mixture models, which reveals that combining dynamic time warping with Gaussian mixture models can improve identification results significantly in a text-independent environment. The term segmental mixture model is used to identify the combination of the two techniques. It is tested on twenty speakers of the BT Millar database, which is a multi-session digit database, and on one thousand speakers of the Welsh SpeechDat database, which is a large text-independent database. In both instances the segmental mixture model demonstrates its potential for enhancing the discrimination between speakers.
Supervisor: Not available
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID: uk.bl.ethos.639099
DOI: Not available