
Correlated Tensor Factorization for Audio Source Separation

Citation Author(s):
Kazuyoshi Yoshii
Submitted by:
Kazuyoshi Yoshii
Last updated:
17 April 2018 - 1:39am
Document Type:
Poster
Document Year:
2018
Event:
Presenters:
Kazuyoshi Yoshii
Paper Code:
AASP-P12.1
 

This paper presents an ultimate extension of nonnegative matrix factorization (NMF) for audio source separation, based on full covariance modeling over all the time-frequency (TF) bins of the complex spectrogram of an observed mixture signal. Although NMF has been widely used for decomposing an observed power spectrogram in a TF-wise manner, it has a critical limitation: it cannot deal with the phase values of interdependent TF bins. This problem has been solved only partially by several phase-aware extensions of NMF that decompose an observed complex spectrogram in a time- and/or frequency-wise manner. In this paper, we propose correlated tensor factorization (CTF), which approximates the full covariance matrix over all TF bins as the sum of the Kronecker products between basis covariance matrices over frequency bands and the corresponding ones over time frames. All the TF bins of the complex spectrogram of each source signal are estimated jointly in an interdependent manner via Wiener filtering. We discuss how to reduce the computational cost of CTF and report the results of a comparative evaluation of CTF against its special cases, such as NMF and positive semidefinite tensor factorization (PSDTF).
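The covariance model and the Wiener filtering step described above can be illustrated with a small NumPy sketch. This is not the poster's implementation: the function names (`random_hpd`, `kron_covariances`, `wiener_separate`), the toy sizes, and the random basis matrices are all assumptions made for illustration, and no parameter estimation is performed. The sketch assumes the spectrogram is flattened in row-major order (frequency index first), so that each source covariance is the Kronecker product of a frequency covariance with a time covariance.

```python
import numpy as np

def random_hpd(n, rng):
    """Random Hermitian positive definite matrix (toy stand-in for a learned basis covariance)."""
    a = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return a @ a.conj().T + 1e-3 * np.eye(n)

def kron_covariances(freq_covs, time_covs):
    """Build one (F*T, F*T) covariance per component as kron(frequency cov, time cov)."""
    return np.stack([np.kron(v, h) for v, h in zip(freq_covs, time_covs)])

def wiener_separate(x, covs):
    """Joint Wiener filter over all TF bins: estimate_k = C_k (sum_j C_j)^{-1} x."""
    total = covs.sum(axis=0)
    z = np.linalg.solve(total, x)  # naive dense solve, cubic in F*T
    return covs @ z                # shape (K, F*T)

# Toy sizes (assumed): F frequency bins, T time frames, K components.
rng = np.random.default_rng(0)
F, T, K = 4, 5, 2
freq_covs = np.stack([random_hpd(F, rng) for _ in range(K)])
time_covs = np.stack([random_hpd(T, rng) for _ in range(K)])
covs = kron_covariances(freq_covs, time_covs)

# Stand-in complex mixture spectrogram, flattened so that index = f * T + t.
X = rng.standard_normal((F, T)) + 1j * rng.standard_normal((F, T))
estimates = wiener_separate(X.reshape(-1), covs).reshape(K, F, T)
```

Because the filters sum to the identity, the component estimates add back up to the mixture. If every basis covariance is diagonal, the Kronecker products are diagonal too and the model reduces to the familiar NMF variance model, with each TF bin filtered independently. The dense solve over all F*T bins is what makes the naive form expensive, which is the cost the abstract says needs to be reduced.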
