
MONAURAL SINGING VOICE SEPARATION WITH SKIP-FILTERING CONNECTIONS AND RECURRENT INFERENCE OF TIME-FREQUENCY MASK

Citation Author(s):
Stylianos Ioannis Mimilakis, Konstantinos Drossos, Joao Felipe Santos, Gerald Schuller, Tuomas Virtanen, Yoshua Bengio
Submitted by:
Stylianos Mimilakis
Last updated:
13 April 2018 - 9:32am
Document Type:
Poster
Document Year:
2018
Event:
Presenters:
Stylianos Ioannis Mimilakis
Paper Code:
2799
 

Singing voice separation based on deep learning relies on time-frequency masking. In many cases the masking process is not a learnable function or is not encapsulated in the deep learning optimization. Consequently, most existing methods rely on a post-processing step using generalized Wiener filtering. This work proposes a method that learns and optimizes (during training) a source-dependent mask and does not need the aforementioned post-processing step. We introduce a recurrent inference algorithm, a sparse transformation step to improve the mask generation process, and a learned denoising filter. The results show an increase of 0.49 dB in signal-to-distortion ratio and 0.30 dB in signal-to-interference ratio compared to previous state-of-the-art approaches for monaural singing voice separation.
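
To make the two masking ideas in the abstract concrete, the following is a minimal NumPy sketch, not the authors' network. It assumes magnitude spectrograms stored as 2-D arrays and assumes, as a simplification, that source magnitudes add up to the mixture magnitude. The function names `apply_mask` and `generalized_wiener_mask` and the exponent `alpha` are hypothetical and chosen only for illustration. The first function shows a mask applied directly to the input mixture (the general idea behind a skip-filtering connection); the second shows the kind of soft mask that a generalized Wiener filtering post-processing step would compute from separate source estimates.

```python
import numpy as np


def apply_mask(mix_mag, mask):
    """Filter the mixture magnitude with a time-frequency mask (element-wise)."""
    return mask * mix_mag


def generalized_wiener_mask(voice_mag, accomp_mag, alpha=2.0, eps=1e-8):
    """Soft mask built from two source magnitude estimates.

    `alpha` is an illustrative exponent (an assumption, not a value from the
    source); `eps` only guards against division by zero.
    """
    v = voice_mag ** alpha
    a = accomp_mag ** alpha
    return v / (v + a + eps)


# Toy data: random "magnitude spectrograms" (frequency bins x time frames).
rng = np.random.default_rng(0)
voice = rng.random((4, 6))
accomp = rng.random((4, 6))
mix = voice + accomp  # simplifying assumption: magnitudes are additive

# Post-processing route: estimate sources first, then derive a mask from them.
mask = generalized_wiener_mask(voice, accomp, alpha=1.0)

# Masking route: multiply the mixture by the mask to obtain the voice estimate.
voice_hat = apply_mask(mix, mask)
```

In this toy setting the mask is built from the true sources, so the masked mixture recovers the voice almost exactly. In the approach the abstract describes, the mask would instead be produced and optimized by the network during training, so no separate Wiener-style step is needed afterwards.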
