
A FULLY CONVOLUTIONAL NEURAL NETWORK FOR COMPLEX SPECTROGRAM PROCESSING IN SPEECH ENHANCEMENT

Citation Author(s):
Zhiheng Ouyang, Hongjiang Yu, Wei-Ping Zhu, Benoit Champagne
Submitted by:
Hongjiang Yu
Last updated:
9 May 2019 - 5:25pm
Document Type:
Presentation Slides
Document Year:
2019
Event:
Presenters:
Hongjiang Yu
Paper Code:
2412
 

In this paper, we propose a fully convolutional neural network (CNN) for complex spectrogram processing in speech enhancement.
The proposed CNN consists of frequency-dilated two-dimensional (2-D) convolutions and 1-D convolutions, and incorporates residual learning and skip connections. Compared with state-of-the-art methods, the proposed CNN achieves better performance with fewer parameters. Experiments show that complex spectrogram processing is effective for phase estimation, which benefits the reconstruction of clean speech, especially for female speech. We also demonstrate that the model performs well with a small memory footprint when the number of parameters is limited.
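To make the terms "frequency-dilated 2-D convolution" and "residual learning" concrete, here is a minimal pure-Python sketch. It is not the authors' implementation: the function names, the 3x3 kernel, the dilation rate of 2, and the zero padding are all assumptions chosen for illustration, since the abstract does not state these details. The input is a single real-valued plane; for a complex spectrogram, the real and imaginary parts could be treated as two such planes.

```python
def freq_dilated_conv2d(x, kernel, dilation=1):
    """Zero-padded ("same") 2-D cross-correlation, dilated along frequency only.

    x:      list of T frames, each a list of F frequency bins
    kernel: list of kt rows, each a list of kf taps (kt and kf assumed odd)
    """
    T, F = len(x), len(x[0])
    kt, kf = len(kernel), len(kernel[0])
    out = [[0.0] * F for _ in range(T)]
    for t in range(T):
        for f in range(F):
            acc = 0.0
            for i in range(kt):
                tt = t + i - kt // 2  # time axis: taps are adjacent frames
                if not 0 <= tt < T:
                    continue  # outside the input: zero padding
                for j in range(kf):
                    # frequency axis: taps are spaced `dilation` bins apart
                    ff = f + (j - kf // 2) * dilation
                    if 0 <= ff < F:
                        acc += kernel[i][j] * x[tt][ff]
            out[t][f] = acc
    return out


def residual_block(x, kernel, dilation=1):
    """Add the block input back onto the convolution output."""
    y = freq_dilated_conv2d(x, kernel, dilation)
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]


# Hypothetical toy input: 3 frames x 9 bins with one impulse at (t=1, f=4).
plane = [[0.0] * 9 for _ in range(3)]
plane[1][4] = 1.0
ones = [[1.0] * 3 for _ in range(3)]  # illustrative 3x3 kernel

spread = freq_dilated_conv2d(plane, ones, dilation=2)
summed = residual_block(plane, ones, dilation=2)
```

With a dilation of 2, the 3x3 kernel spans five frequency bins while still using only nine weights, which is one way a network can widen its receptive field along frequency without adding parameters.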
