Correlated Tensor Factorization for Audio Source Separation

This paper presents an ultimate extension of nonnegative matrix factorization (NMF) for audio source separation, based on full covariance modeling over all the time-frequency (TF) bins of the complex spectrogram of an observed mixture signal. Although NMF has been widely used for decomposing an observed power spectrogram in a TF-wise manner, it has a critical limitation: it cannot deal with the phase values of interdependent TF bins. This problem has been solved only partially by several phase-aware extensions of NMF that decompose an observed complex spectrogram in a time- and/or frequency-wise manner. In this paper, we propose correlated tensor factorization (CTF), which approximates the full covariance matrix over all TF bins as the sum of the Kronecker products between basis covariance matrices over frequency bands and the corresponding ones over time frames. All the TF bins of the complex spectrogram of each source signal are estimated jointly, in an interdependent manner, via Wiener filtering. We discuss how to reduce the computational cost of CTF and report the results of a comparative evaluation of CTF against its special cases, such as NMF and positive semidefinite tensor factorization (PSDTF).
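The covariance structure and the Wiener filtering step described above can be illustrated with a small NumPy sketch. This is not the authors' implementation: the basis matrices are random stand-ins for learned ones, no parameter estimation is performed, and the function names (`random_psd`, `ctf_covariances`, `wiener_separate`), the sizes `F`, `T`, `K`, and the row-major flattening of the spectrogram are all assumptions made for illustration.

```python
import numpy as np


def random_psd(dim, rng):
    """Random Hermitian positive definite matrix (stand-in for a learned basis)."""
    a = rng.standard_normal((dim, dim)) + 1j * rng.standard_normal((dim, dim))
    return a @ a.conj().T / dim + 0.1 * np.eye(dim)


def ctf_covariances(freq_covs, time_covs):
    """Return one (F*T, F*T) covariance per component: kron(V_k, H_k).

    With this ordering, entry f * T + t of the flattened spectrogram
    corresponds to frequency bin f and time frame t (assumed convention).
    """
    return [np.kron(v, h) for v, h in zip(freq_covs, time_covs)]


def wiener_separate(x, component_covs):
    """Multichannel-style Wiener filter over all TF bins at once.

    Each component estimate is Y_k @ inv(sum_j Y_j) @ x.
    """
    total = sum(component_covs)
    w = np.linalg.solve(total, x)  # total^{-1} x without forming the inverse
    return [c @ w for c in component_covs]


rng = np.random.default_rng(0)
F, T, K = 4, 5, 3  # hypothetical toy sizes

freq_covs = [random_psd(F, rng) for _ in range(K)]  # F x F, over frequency
time_covs = [random_psd(T, rng) for _ in range(K)]  # T x T, over time
covs = ctf_covariances(freq_covs, time_covs)

# Flattened complex "mixture spectrogram" of shape (F*T,)
x = rng.standard_normal(F * T) + 1j * rng.standard_normal(F * T)
estimates = wiener_separate(x, covs)
```

Because each filter is the component covariance times the inverse of their sum, the component estimates add back up to the mixture. If every basis matrix is restricted to be diagonal, each Kronecker product is diagonal as well, with entries given by a product of a frequency weight and a time activation, which matches the TF-wise NMF model. The direct solve costs on the order of (FT)^3 operations, which shows why reducing the computational cost matters for this model.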

icassp-2018-yoshii-poster.pdf

