
Speech Denoising by Parametric Resynthesis

Citation Author(s):
Soumi Maiti, Michael I Mandel
Submitted by:
Soumi Maiti
Last updated:
10 May 2019 - 3:35pm
Document Type:
Poster
Document Year:
2019
Event:
Presenters:
Michael Mandel
Paper Code:
4075
 

This work proposes the use of clean speech vocoder parameters
as the target for a neural network performing speech enhancement.
These parameters were designed for text-to-speech
synthesis so that they both produce high-quality resyntheses
and are straightforward to model with neural networks,
but they have not been used in speech enhancement until now.
In comparison to a matched text-to-speech system that is given
the ground truth transcripts of the noisy speech, our model is
able to produce more natural speech because it has access to
the true prosody in the noisy speech. In comparison to two
denoising systems, the oracle Wiener mask and a DNN-based
mask predictor, our model matches the oracle Wiener mask in
subjective quality and intelligibility and surpasses the realistic
DNN-based mask predictor. A vocoder-based upper bound shows that there
is still room for improvement with this approach beyond the
oracle Wiener mask. We test speaker-dependence with two
speakers and show that a single model can be used for multiple
speakers.
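
For readers unfamiliar with the oracle Wiener mask baseline, the following is a minimal sketch of one common textbook formulation, not necessarily the exact definition used in this work. It assumes access to the complex STFTs of the clean speech and of the noise separately (which is what makes the mask an "oracle"), and the names `oracle_wiener_mask`, `apply_mask`, and `eps` are illustrative.

```python
import numpy as np


def oracle_wiener_mask(clean_spec, noise_spec, eps=1e-12):
    """Wiener-like ratio mask computed from ground-truth clean and noise STFTs.

    Assumption: mask = |S|^2 / (|S|^2 + |N|^2), a common textbook form.
    eps is a small constant that avoids division by zero in silent bins.
    """
    clean_pow = np.abs(clean_spec) ** 2
    noise_pow = np.abs(noise_spec) ** 2
    return clean_pow / (clean_pow + noise_pow + eps)


def apply_mask(noisy_spec, mask):
    """Scale each time-frequency bin of the noisy STFT by the mask."""
    return mask * noisy_spec


if __name__ == "__main__":
    # Toy complex "spectrograms" (frequency bins x frames), for illustration only.
    rng = np.random.default_rng(0)
    shape = (4, 3)
    clean = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
    noise = 0.5 * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))
    mask = oracle_wiener_mask(clean, noise)
    enhanced = apply_mask(clean + noise, mask)
    print(mask.min(), mask.max())  # mask values lie in [0, 1]
    print(enhanced.shape)
```

Because the mask needs the clean and noise signals separately, it cannot be computed on real recordings; it serves only as a reference point for mask-based enhancement systems, which is the role it plays in the comparison above.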
