
FSPEN: An Ultra-Lightweight Network for Real Time Speech Enhancement

DOI: 10.60864/wpd7-j135
Citation Author(s):
Submitted by: Yang Lei
Last updated: 6 June 2024 - 10:28am
Document Type: Poster
Document Year: 2024
Event:
Presenters: Lei Yang
Paper Code: SLP-P3.6
 

Deep learning-based speech enhancement methods have shown promising results in recent years. In practical applications, however, model size and computational complexity are important factors that limit their use in end products. Products that require real-time speech enhancement with limited resources, such as TWS headsets, hearing aids, and IoT devices, therefore need ultra-lightweight models. In this paper, we propose FSPEN, an ultra-lightweight network for real-time speech enhancement. It combines a full-band and sub-band network structure, which extracts global and local features, with an inter-frame path extension method that increases the network's modeling capacity without increasing its complexity. Experiments show that FSPEN achieves a PESQ of 2.97 on the VoiceBank+Demand dataset at 89M multiply-accumulate operations per second (MACs) and 79k parameters.
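To put the reported budget in perspective, the following minimal sketch converts the two figures from the abstract (89M multiply-accumulate operations per second, 79k parameters) into a per-frame compute budget and a weight-storage footprint. The frame hop durations and the bytes-per-weight values are illustrative assumptions, not values stated in the abstract, and the helper names are hypothetical.

```python
# Figures quoted in the abstract.
MACS_PER_SECOND = 89e6      # multiply-accumulate operations per second
NUM_PARAMETERS = 79_000     # trainable parameters


def macs_per_frame(hop_seconds: float) -> float:
    """MACs available per processed frame for a given frame hop.

    hop_seconds is an assumption: the abstract does not state the hop size.
    """
    return MACS_PER_SECOND * hop_seconds


def parameter_bytes(bytes_per_weight: int) -> int:
    """Weight storage in bytes, assuming every parameter uses the same width."""
    return NUM_PARAMETERS * bytes_per_weight


# Hypothetical hop sizes (milliseconds), chosen only for illustration.
for hop_ms in (8, 16):
    budget = macs_per_frame(hop_ms / 1000)
    print(f"hop {hop_ms} ms: about {budget / 1e6:.2f}M MACs per frame")

# Hypothetical weight widths: 32-bit float and 8-bit integer.
for label, width in (("float32", 4), ("int8", 1)):
    print(f"{label}: {parameter_bytes(width) / 1000:.0f} kB of weights")
```

Under these assumptions the weights alone would occupy a few hundred kilobytes at most, which is the kind of footprint that matters for the resource-limited devices the abstract names.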
