Use this URL to cite or link to this record in EThOS: https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.742643
Title: Speech analysis using very low-dimensional bottleneck features and phone-class dependent neural networks
Author: Bai, Linxue
ISNI:       0000 0004 7230 832X
Awarding Body: University of Birmingham
Current Institution: University of Birmingham
Date of Award: 2018
Availability of Full Text:
Access from EThOS:
Access from Institution:
Abstract:
The first part of this thesis focuses on very low-dimensional bottleneck features (BNFs), extracted from deep neural networks (DNNs) for speech analysis and recognition. Very low-dimensional BNFs are analysed in terms of their capability of representing speech and their suitability for modelling speech dynamics. Nine-dimensional BNFs obtained from a phone discrimination DNN are shown to give comparable phone recognition accuracy to 39-dimensional MFCCs, and an average of 34% higher phone recognition accuracy than formant-based features of the same dimensions. They also preserve the trajectory continuity well and thus hold promise for modelling speech dynamics. Visualisations and interpretations of the BNFs are presented, with phonetically motivated studies of the strategies that DNNs employ to create these features. The relationships between BNF representations resulting from different initialisations of DNNs are explored. The second part of this thesis considers BNFs from the perspective of feature extraction. It is motivated by the observation that different types of speech sounds lend themselves to different acoustic analysis, and that the mapping from spectra-in-context to phone posterior probabilities implemented by the DNN is a continuous approximation to a discontinuous function. This suggests that it may be advantageous to replace the single DNN with a set of phone class dependent DNNs. In this case, the appropriate mathematical structure is a manifold. It is shown that this approach leads to significant improvements in frame level phone classification accuracy.
Supervisor: Not available Sponsor: Not available
Qualification Name: Thesis (Ph.D.) Qualification Level: Doctoral
EThOS ID: uk.bl.ethos.742643  DOI: Not available
Keywords: TK Electrical engineering. Electronics Nuclear engineering
Share: