
ICASSP 2019 presentation slides

Citation Author(s):
Matthias Zöhrer, Franz Pernkopf
Submitted by:
Lukas Pfeifenberger
Last updated:
28 April 2021 - 6:26am
Document Type:
Presentation Slides
Document Year:
2019
Presenters:
Lukas Pfeifenberger
Paper Code:
2996

We propose a complex-valued deep neural network (cDNN) for speech enhancement and source separation. While existing end-to-end systems use complex-valued gradients to pass the training error to a real-valued DNN for gain mask estimation, we exploit the full potential of complex-valued LSTMs, MLPs, and activation functions to estimate complex-valued beamforming weights directly from complex-valued microphone array data. This allows our cDNN to locate and track different moving sources by exploiting the phase information in the data. In our experiments, we use a typical living room environment, mixtures from the Wall Street Journal corpus, and YouTube noise. We compare our cDNN against the BeamformIt toolkit as a baseline and against a mask-based beamformer as a state-of-the-art reference system, and observe significant improvements in terms of PESQ, STOI, and WER.
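As a rough illustration of the beamforming step described in the abstract, the sketch below shows how complex-valued beamforming weights w(f), however they are estimated, can be applied to a multichannel STFT to produce a single enhanced channel. The shapes, variable names, and random toy data are illustrative assumptions only and are not taken from the authors' implementation.

```python
import numpy as np

def apply_beamformer(weights, stft_frames):
    """Apply per-frequency complex beamforming weights to a multichannel STFT.

    weights:      (F, M) complex beamforming weights per frequency bin
    stft_frames:  (F, M, T) complex multichannel STFT
    returns:      (F, T) beamformed single-channel STFT,
                  y(f, t) = w(f)^H x(f, t)
    """
    return np.einsum('fm,fmt->ft', weights.conj(), stft_frames)

# Toy example with random data (hypothetical shapes: 257 bins, 6 mics, 100 frames)
F, M, T = 257, 6, 100
X = np.random.randn(F, M, T) + 1j * np.random.randn(F, M, T)
w = np.random.randn(F, M) + 1j * np.random.randn(F, M)
Y = apply_beamformer(w, X)
print(Y.shape)  # (257, 100)
```

In the paper's setting, the weights w(f) would be produced by the complex-valued network from the microphone array STFT rather than drawn at random; the application step itself is the standard filter-and-sum operation shown here.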
