Title: Cochlea modelling and its application to speech processing
Author: Pan, Shuokai
ISNI: 0000 0004 7656 3774
Awarding Body: University of Southampton
Current Institution: University of Southampton
Date of Award: 2018
Availability of Full Text: Full text unavailable from EThOS.
Abstract: Models of the cochlea are a valuable tool for better understanding its mechanics and an inspiration for many speech processing algorithms. Realistic modelling of the cochlea can be computationally demanding, however, which limits its use in signal processing applications. To mitigate this issue, an efficient numerical method has been proposed for performing time domain simulations, based on a nonlinear state space formulation. This model has then been contrasted with another type of cochlear model, which is built from a cascade of digital filters. The responses of these two models have been compared in terms of their realism in simulating the measured nonlinear cochlear response to single tones and pairs of tones. Guided by these results, the filter cascade model is chosen for subsequent signal processing applications because it is significantly more efficient than the state space model, while still producing realistic responses. Using this nonlinear filter cascade model as a front-end, two speech processing tasks have been investigated: voice activity detection and supervised speech separation. Both tasks are tackled within a machine learning framework, in which a neural network is trained to reproduce target outputs. The results are compared with those obtained using a number of other, simpler auditory-inspired analysis methods. Simulation results show that although the nonlinear filter cascade model can be more effective in many testing scenarios, its relative advantage over other analysis methods is small. The incorporation of temporal context information and network structure engineering are found to be more important in improving performance on these tasks. Once a suitable context expansion strategy has been selected, the difference between the various front-end processing methods considered is marginal.
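The abstract does not say which context expansion strategy the thesis uses. As an illustration only, the sketch below shows one common form, frame splicing, in which each feature frame from the front-end is concatenated with its neighbours before being passed to the network. The function name `expand_context`, the window sizes, the edge padding rule, and the feature values are all assumptions made for this example and are not taken from the thesis.

```python
def expand_context(frames, left=2, right=2):
    """Splice each frame with its neighbours to form a context window.

    frames: list of equal-length feature vectors (lists of floats),
            one per time frame, e.g. the output of an auditory front-end.
    left, right: number of past and future frames to include (assumed values).
    Edges are padded by repeating the first or last frame.
    """
    if not frames:
        return []
    n = len(frames)
    expanded = []
    for t in range(n):
        window = []
        for offset in range(-left, right + 1):
            # Clamp the index so edge frames reuse the nearest valid frame.
            idx = min(max(t + offset, 0), n - 1)
            window.extend(frames[idx])
        expanded.append(window)
    return expanded


# Three frames of a two-channel front-end output (made-up numbers).
features = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
spliced = expand_context(features, left=1, right=1)
```

With one frame of context on each side, every two-channel frame becomes a six-element input vector, so the network sees a short stretch of time rather than a single instant.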
Supervisor: Elliott, Stephen
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available