Use this URL to cite or link to this record in EThOS: http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.728355
Title: Underdetermined convolutive source separation using two dimensional non-negative factorization techniques
Author: Al Tmeme, Ahmed Sattar Hadi
ISNI:       0000 0004 6499 8148
Awarding Body: Newcastle University
Current Institution: University of Newcastle upon Tyne
Date of Award: 2017
Availability of Full Text:
Access from EThOS:
Access from Institution:
Abstract:
In this thesis the underdetermined audio source separation has been considered, that is, estimating the original audio sources from the observed mixture when the number of audio sources is greater than the number of channels. The separation has been carried out using two approaches; the blind audio source separation and the informed audio source separation. The blind audio source separation approach depends on the mixture signal only and it assumes that the separation has been accomplished without any prior information (or as little as possible) about the sources. The informed audio source separation uses the exemplar in addition to the mixture signal to emulate the targeted speech signal to be separated. Both approaches are based on the two dimensional factorization techniques that decompose the signal into two tensors that are convolved in both the temporal and spectral directions. Both approaches are applied on the convolutive mixture and the high-reverberant convolutive mixture which are more realistic than the instantaneous mixture. In this work a novel algorithm based on the nonnegative matrix factor two dimensional deconvolution (NMF2D) with adaptive sparsity has been proposed to separate the audio sources that have been mixed in an underdetermined convolutive mixture. Additionally, a novel Gamma Exponential Process has been proposed for estimating the convolutive parameters and number of components of the NMF2D/ NTF2D, and to initialize the NMF2D parameters. In addition, the effects of different window length have been investigated to determine the best fit model that suit the characteristics of the audio signal. Furthermore, a novel algorithm, namely the fusion K models of full-rank weighted nonnegative tensor factor two dimensional deconvolution (K-wNTF2D) has been proposed. The K-wNTF2D is developed for its ability in modelling both the spectral and temporal changes, and the spatial covariance matrix that addresses the high reverberation problem. Variable sparsity that derived from the Gibbs distribution is optimized under the Itakura-Saito divergence and adapted into the K-wNTF2D model. The tensors of this algorithm have been initialized by a novel initialization method, namely the SVD two-dimensional deconvolution (SVD2D). Finally, two novel informed source separation algorithms, namely, the semi-exemplar based algorithm and the exemplar-based algorithm, have been proposed. These algorithms based on the NMF2D model and the proposed two dimensional nonnegative matrix partial co-factorization (2DNMPCF) model. The idea of incorporating the exemplar is to inform the proposed separation algorithms about the targeted signal to be separated by initializing its parameters and guide the proposed separation algorithms. The adaptive sparsity is derived for both ii of the proposed algorithms. Also, a multistage of the proposed exemplar based algorithm has been proposed in order to further enhance the separation performance. Results have shown that the proposed separation algorithms are very promising, more flexible, and offer an alternative model to the conventional methods.
Supervisor: Not available Sponsor: Not available
Qualification Name: Thesis (Ph.D.) Qualification Level: Doctoral
EThOS ID: uk.bl.ethos.728355  DOI: Not available
Share: