All Neural Kronecker Product Beamforming for Speech Extraction with Large-scale Microphone Arrays

Citation Author(s):
Submitted by:
Jian Li
Last updated:
6 April 2024 - 10:01pm
Document Type:
Poster
 

Existing frame-wise neural beamformers for speech extraction achieve promising performance at relatively high signal-to-noise ratios (SNRs) with small microphone arrays, but their performance still degrades in low-SNR environments, e.g., SNR < -5 dB. To address this problem, this paper proposes an all-neural beamformer based on Kronecker product decomposition, denoted NeuKP-BF, for large-scale microphone arrays. The core idea is to combine the high spatial resolution of large microphone arrays with the powerful non-linear modeling capability of deep neural networks to improve speech extraction in challenging environments. To reduce redundancy in the feature representation and improve interpretability, we use the Kronecker product rule to decompose the original large-scale array into two small virtual subarrays; a beamformer is designed for each subarray, and the two are then merged. The whole system is implemented in an end-to-end manner. Experiments were conducted on synthesized data generated from the DNS-Challenge corpus. The results show that the proposed approach outperforms existing advanced baselines on multiple objective metrics.
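
The algebra behind merging two subarray beamformers can be illustrated with a short NumPy sketch. It is not the NeuKP-BF network: the weights below are random placeholders rather than network outputs, and the subarray sizes `M1` and `M2`, the helper `kron_beamform`, and the row-major channel ordering are all assumptions made for illustration. It only shows that a full-array weight vector of the form `kron(w1, w2)` needs `M1 + M2` coefficients instead of `M1 * M2`, and that it can be applied to a snapshot without ever forming the full vector.

```python
import numpy as np


def kron_beamform(w1, w2, x):
    """Apply the merged beamformer kron(w1, w2) to one snapshot x.

    Hypothetical helper: computes kron(w1, w2)^H x without building
    the full-length weight vector.
    """
    # Row-major reshape: channel index i * M2 + j maps to X[i, j].
    X = x.reshape(w1.size, w2.size)
    # sum_{i,j} conj(w1[i]) * conj(w2[j]) * X[i, j]
    return w1.conj() @ X @ w2.conj()


rng = np.random.default_rng(0)
M1, M2 = 4, 8  # assumed virtual subarray sizes, full array M = 32

# Random complex weights stand in for the two subarray beamformers.
w1 = rng.standard_normal(M1) + 1j * rng.standard_normal(M1)
w2 = rng.standard_normal(M2) + 1j * rng.standard_normal(M2)

# One multichannel snapshot (a single time-frequency bin).
x = rng.standard_normal(M1 * M2) + 1j * rng.standard_normal(M1 * M2)

# Reference: explicit full-array beamformer w = w1 (x) w2, output w^H x.
w_full = np.kron(w1, w2)
y_full = np.vdot(w_full, x)  # vdot conjugates its first argument

# Factorized evaluation using only the two small weight vectors.
y_fact = kron_beamform(w1, w2, x)

print("free coefficients:", M1 + M2, "instead of", M1 * M2)
print("outputs match:", np.isclose(y_full, y_fact))
```

The factorized form works because `kron(w1, w2)[i * M2 + j]` equals `w1[i] * w2[j]`, so the inner product with the snapshot separates into one product over each virtual subarray.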
