Use this URL to cite or link to this record in EThOS: https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.781281
Title: Binaural sound source localization using machine learning with spiking neural networks features extraction
Author: Al-Abboodi, H. M. A.
ISNI:       0000 0004 7966 911X
Awarding Body: University of Salford
Current Institution: University of Salford
Date of Award: 2019
Availability of Full Text:
Access from EThOS:
Access from Institution:
Abstract:
Human and animal binaural hearing systems are able take advantage of a variety of cues to localise sound-sources in a 3D space using only two sensors. This work presents a bionic system that utilises aspects of binaural hearing in an automated source localisation task. A head and torso emulator (KEMAR) are used to acquire binaural signals and a spiking neural network is used to compare signals from the two sensors. The firing rates of coincidence-neurons in the spiking neural network model provide information as to the location of a sound source. Previous methods have used a winner-takesall approach, where the location of the coincidence-neuron with the maximum firing rate is used to indicate the likely azimuth and elevation. This was shown to be accurate for single sources, but when multiple sources are present the accuracy significantly reduces. To improve the robustness of the methodology, an alternative approach is developed where the spiking neural network is used as a feature pre-processor. The firing rates of all coincidence-neurons are then used as inputs to a Machine Learning model which is trained to predict source location for both single and multiple sources. A novel approach that applied spiking neural networks as a binaural feature extraction method was presented. These features were processed using deep neural networks to localise multisource sound signals that were emitted from different locations. Results show that the proposed bionic binaural emulator can accurately localise sources including multiple and complex sources to 99% correctly predicted angles from single-source localization model and 91% from multi-source localization model. The impact of background noise on localisation performance has also been investigated and shows significant degradation of performance. The multisource localization model was trained with multi-condition background noise at SNRs of 10dB, 0dB, and -10dB and tested at controlled SNRs. The findings demonstrate an enhancement in the model performance in compared with noise free training data.
Supervisor: Not available Sponsor: Ministry of Higher Education, Iraq
Qualification Name: Thesis (Ph.D.) Qualification Level: Doctoral
EThOS ID: uk.bl.ethos.781281  DOI: Not available
Share: