Title: Efficient feature extraction based on two-dimensional cepstrum analysis for speech recognition
Author: Marvi, Hossein
ISNI: 0000 0001 3620 4716
Awarding Body: University of Surrey
Current Institution: University of Surrey
Date of Award: 2004
Speech recognition requires a feature extraction technique that transforms the raw speech signal into a set of feature vectors while preserving most of the information in the signal. The features should ideally be compact, distinct and representative of the speech signal. If the feature vectors do not represent the important content of the speech, the system will perform poorly regardless of the pattern recognition techniques applied. Many different feature representations of the speech signal have been suggested and tried for speech recognition. The most popular features currently in use are Mel-frequency cepstral coefficients (MFCC) and perceptual linear prediction (PLP), which are based on one-dimensional cepstrum analysis. The two-dimensional cepstrum (TDC) is an alternative approach to time-frequency representation of a speech signal that can preserve both its instantaneous and transitional information. The principal aim of this thesis is to study two-dimensional cepstrum analysis as a feature extraction technique for speech recognition. A novel feature extraction technique, the two-dimensional root cepstrum (TDRC), is also introduced. It has the advantage of an adjustable γ parameter, which can be used to optimise the feature extraction process, reducing the dimensions of the feature matrix and simplifying computation. In addition, the Mel TDRC is proposed as a modification of the original TDRC to improve accuracy. It is shown that both the TDC and the TDRC outperform the conventional cepstrum. To preserve both magnitude and phase details of the speech signal simultaneously in a feature matrix, the Hartley transform (HT) is suggested as a substitute for the Fourier transform (FT) in two-dimensional cepstrum analysis.
Experimental results demonstrate the enhanced capability of the HT in two-dimensional root cepstral analysis to improve recognition accuracy. An experimental comparative study of nine feature extraction methods based on cepstral analysis is also carried out.
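The ideas named in the abstract can be illustrated with a minimal sketch. It assumes one common textbook-style formulation (per-frame magnitude spectra, compressed by a logarithm or by a root exponent, followed by a 2D inverse DFT) and is not taken from the thesis itself. The function names `frame_signal`, `two_d_cepstrum` and `dht`, the Hamming window, and the argument name `gamma` for the root exponent are all illustrative assumptions.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping Hamming-windowed frames (rows)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx] * np.hamming(frame_len)

def two_d_cepstrum(frames, gamma=None, eps=1e-10):
    """Assumed formulation: compress per-frame magnitude spectra, then 2D inverse DFT.

    gamma=None uses log compression (a TDC-style matrix);
    a numeric gamma uses |X|**gamma instead (a root-cepstrum-style matrix).
    """
    magnitude = np.abs(np.fft.fft(frames, axis=1))
    if gamma is None:
        compressed = np.log(magnitude + eps)  # eps avoids log(0)
    else:
        compressed = magnitude ** gamma
    # Only the real part of the 2D inverse DFT is kept in this sketch.
    return np.real(np.fft.ifft2(compressed))

def dht(x):
    """Discrete Hartley transform along the last axis, computed from the FFT.

    H[k] = Re(F[k]) - Im(F[k]), a real-valued transform that carries both
    the real and imaginary parts of the Fourier spectrum.
    """
    spectrum = np.fft.fft(x, axis=-1)
    return spectrum.real - spectrum.imag
```

A call such as `two_d_cepstrum(frame_signal(x, 256, 128), gamma=0.3)` returns a matrix with one row per frame; in practice only a small low-order block of such a matrix would typically be kept as features. How the thesis combines the Hartley transform with the root cepstrum is not stated in this record, so `dht` is shown only as a standalone transform.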
Supervisor: Not available
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:  DOI: Not available