FSPEN: An Ultra-Lightweight Network for Real Time Speech Enhancement
- DOI: 10.60864/wpd7-j135
- Submitted by: Yang Lei
- Last updated: 6 June 2024 - 10:28am
- Document Type: Poster
- Document Year: 2024
- Presenters: Lei Yang
- Paper Code: SLP-P3.6
Deep learning-based speech enhancement methods have shown promising results in recent years. In practical applications, however, model size and computational complexity limit their use in end products. Products that require real-time speech enhancement with limited resources, such as TWS headsets, hearing aids, and IoT devices, therefore need ultra-lightweight models. In this paper, we propose FSPEN, an ultra-lightweight network for real-time speech enhancement. It combines a full-band and sub-band network structure for extracting global and local features with an inter-frame path extension method that increases the network's modeling capacity while keeping its complexity unchanged. Experiments show that FSPEN achieves a PESQ of 2.97 on the VoiceBank+Demand dataset with 89M multiply-accumulate operations (MACs) per second and 79k parameters.
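To put the reported cost figures in context, the following is a rough back-of-the-envelope sketch in Python. It converts the 89M MACs per second into a per-frame compute budget and the 79k parameters into a weight-storage estimate. The sample rate (16 kHz), hop size (256 samples), and numeric precisions are illustrative assumptions, not values stated in the abstract, and the helper names are hypothetical.

```python
# Rough resource estimate from the figures quoted in the abstract.
# The two reported numbers come from the abstract; everything marked
# "assumption" is illustrative and NOT taken from the paper.

MACS_PER_SECOND = 89e6   # reported: 89M multiply-accumulate operations per second
NUM_PARAMS = 79_000      # reported: 79k parameters

SAMPLE_RATE_HZ = 16_000  # assumption: wideband speech sample rate
HOP_SAMPLES = 256        # assumption: samples between consecutive frames


def macs_per_frame(macs_per_second: float, sample_rate_hz: int, hop_samples: int) -> float:
    """Compute budget available for one frame in a streaming model."""
    frames_per_second = sample_rate_hz / hop_samples
    return macs_per_second / frames_per_second


def param_memory_bytes(num_params: int, bytes_per_param: int) -> int:
    """Storage needed for the weights at a given numeric precision."""
    return num_params * bytes_per_param


per_frame = macs_per_frame(MACS_PER_SECOND, SAMPLE_RATE_HZ, HOP_SAMPLES)
frame_period_ms = 1000 * HOP_SAMPLES / SAMPLE_RATE_HZ
float32_bytes = param_memory_bytes(NUM_PARAMS, 4)  # assumption: float32 weights
int8_bytes = param_memory_bytes(NUM_PARAMS, 1)     # assumption: int8-quantized weights

print(f"MACs per frame: {per_frame:,.0f}")
print(f"Frame period:   {frame_period_ms:.1f} ms")
print(f"Weights:        {float32_bytes / 1000:.0f} kB (float32), {int8_bytes / 1000:.0f} kB (int8)")
```

Under these assumptions, each frame must be processed within one 16 ms hop period for real-time operation, and the weights occupy a few hundred kilobytes at most. A different hop size or sample rate changes the per-frame budget proportionally.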