A Lightweight Hybrid Multi-Channel Speech Extraction System with Directional Voice Activity Detection

DOI: 10.60864/g661-5738
Citation Author(s): Tianchi Sun, Tong Lei, Xu Zhang, Yuxiang Hu, Changbao Zhu, Jing Lu
Submitted by: Tianchi Sun
Last updated: 15 April 2024 - 9:02pm
Document Type: Presentation Slides
Document Year: 2024
Event:
Presenters: Tianchi Sun
Paper Code: AASP-L7.4
 

Although deep learning (DL) based end-to-end models have shown outstanding performance in multi-channel speech extraction, their high computational complexity restricts their practical application on edge devices. In this paper, we propose a hybrid system that more effectively integrates the generalized sidelobe canceller (GSC) with a lightweight post-filtering model, assisted by spatial speaker activity information from a directional voice activity detection (DVAD) module. DVAD guides the update of the adaptive blocking matrix (ABM) and the adaptive interference canceller (AIC) in the GSC to alleviate distortion of the desired speech, and it also serves as an auxiliary input to the post-filtering model to enhance its interference suppression. Experimental results demonstrate that, at much lower computational cost, our method achieves performance comparable to a current state-of-the-art end-to-end model on simulated data and generalizes even better on real-world data.
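
To make the DVAD-guided adaptation concrete, the following is a minimal sketch and not the authors' implementation. It assumes a single frequency bin of STFT data already steered toward the target (so the target is identical across microphones), a hard binary DVAD decision per frame, a delay-and-sum fixed beamformer, and plain NLMS updates. The function name `dvad_gated_gsc` and the parameters `mu` and `eps` are hypothetical. The ABM is adapted only in frames where the DVAD flags the target as active, and the AIC only in frames where it does not.

```python
import numpy as np

def dvad_gated_gsc(X, dvad, mu=0.5, eps=1e-8):
    """Illustrative DVAD-gated GSC for one frequency bin (assumed setup).

    X    : (frames, mics) complex array, pre-steered toward the target.
    dvad : (frames,) bool array, True when the target speaker is active.
    Returns the (frames,) complex GSC output.
    """
    T, M = X.shape
    b = np.ones(M, dtype=complex)   # ABM coefficients (one per microphone)
    w = np.zeros(M, dtype=complex)  # AIC coefficients
    out = np.empty(T, dtype=complex)
    for t in range(T):
        x = X[t]
        d = x.mean()                # fixed beamformer: delay-and-sum
        u = x - b * d               # ABM: remove the target estimate per channel
        e = d - np.vdot(w, u)       # AIC: subtract w^H u from the beamformer output
        out[t] = e
        if dvad[t]:
            # Target active: adapt the ABM so less target leaks into u.
            b += mu * u * np.conj(d) / (abs(d) ** 2 + eps)
        else:
            # Target inactive: adapt the AIC to cancel the interference.
            w += mu * u * np.conj(e) / (np.vdot(u, u).real + eps)
    return out
```

Freezing the AIC while the target is active is what keeps the canceller from adapting to, and thereby distorting, the desired speech; in this sketch a soft DVAD probability could replace the hard gate by scaling `mu`, which is likewise an assumption rather than something the abstract specifies.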
