
RENet: A Time-Frequency Domain General Speech Restoration Network for ICASSP 2024 Speech Signal Improvement Challenge

Citation Author(s):
Fengyuan Hao, Huiyong Zhang, Lingling Dai, Xiaoxue Luo, Xiaodong Li, Chengshi Zheng
Submitted by:
Fengyuan Hao
Last updated:
12 April 2024 - 5:08am
Document Type:
Presentation Slides
Event:
Presenters:
Fengyuan Hao
Paper Code:
GC-L11.4
 

The ICASSP 2024 Speech Signal Improvement (SSI) Challenge seeks to address speech quality degradation in telecommunication systems. In this context, this paper proposes RENet, a time-frequency (T-F) domain method that leverages complex spectrum mapping to mitigate speech distortions. Specifically, RENet is a multi-stage network. First, TF-GridGAN was designed to recover the degraded speech with a generative adversarial network (GAN). Second, a full-band enhancement module was introduced to eliminate the residual noise and artifacts remaining in the output of TF-GridGAN. Finally, a lightweight bandwidth extension (BWE) network was implemented to further improve speech quality by generating high-resolution speech. Subjective results confirmed the competitive performance of the proposed method under various distortions, and the method ranked 2nd in the non-real-time track of the ICASSP 2024 SSI Challenge.
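
The three-stage flow described in the abstract can be pictured with a minimal sketch. This is only an illustration under stated assumptions, not the authors' implementation: the names `dft`, `idft`, `restore_stage`, `enhance_stage`, `bwe_stage`, and `run_pipeline` are hypothetical, each stage is an identity placeholder standing in for a trained network, and a single-frame DFT stands in for a full short-time Fourier transform. A real bandwidth extension stage would also change the output bandwidth, which this sketch does not attempt. What it does show is the general shape of complex spectrum mapping: every stage takes complex T-F bins in and returns complex T-F bins out, so real and imaginary parts are handled together.

```python
import cmath
import math
from typing import Callable, List

Spectrum = List[complex]                 # complex bins of one frame
Stage = Callable[[Spectrum], Spectrum]   # maps a complex spectrum to a complex spectrum


def dft(frame: List[float]) -> Spectrum:
    """Naive discrete Fourier transform of one real-valued frame."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        for k in range(n)
    ]


def idft(spec: Spectrum) -> List[float]:
    """Naive inverse DFT; returns the real part of each reconstructed sample."""
    n = len(spec)
    return [
        (sum(c * cmath.exp(2j * math.pi * k * t / n) for k, c in enumerate(spec)) / n).real
        for t in range(n)
    ]


# Hypothetical placeholders: each one is an identity mapping, NOT a real model.
def restore_stage(spec: Spectrum) -> Spectrum:
    """Stand-in for the GAN-based restoration stage (TF-GridGAN in the abstract)."""
    return list(spec)


def enhance_stage(spec: Spectrum) -> Spectrum:
    """Stand-in for the full-band enhancement stage."""
    return list(spec)


def bwe_stage(spec: Spectrum) -> Spectrum:
    """Stand-in for the bandwidth extension stage (bandwidth is left unchanged here)."""
    return list(spec)


def run_pipeline(frame: List[float], stages: List[Stage]) -> List[float]:
    """Transform to the complex spectrum, apply the stages in order, transform back."""
    spec = dft(frame)
    for stage in stages:
        spec = stage(spec)
    return idft(spec)


STAGES: List[Stage] = [restore_stage, enhance_stage, bwe_stage]
```

Because the placeholder stages are identities, `run_pipeline(frame, STAGES)` simply reconstructs the input frame up to floating-point error; swapping any placeholder for a learned complex-valued mapping keeps the same interface.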
