
DEREVERBERATION AND BEAMFORMING IN FAR-FIELD SPEAKER RECOGNITION

Citation Author(s):
Ladislav Mosner, Pavel Matejka, Ondrej Novotny, Jan Cernocky
Submitted by:
Ladislav Mosner
Last updated:
13 April 2018 - 5:47am
Document Type:
Poster
Document Year:
2018
Presenters:
Ladislav Mosner
Paper Code:
3426
 

This paper deals with far-field speaker recognition. On a corpus of NIST SRE 2010 data retransmitted in a real room with multiple microphones, we first demonstrate how room acoustics cause significant degradation of a state-of-the-art i-vector-based speaker recognition system. We then investigate several techniques to improve performance, ranging from probabilistic linear discriminant analysis (PLDA) re-training, through dereverberation, to beamforming. We found that weighted prediction error (WPE) based dereverberation combined with a generalized eigenvalue beamformer, with power spectral density (PSD) weighting masks generated by neural networks (NN), yields results approaching those of the clean close-microphone setup. Further improvement was obtained by re-training PLDA or the mask-generating NNs on simulated target data. The work shows that a speaker recognition system that works robustly in the far-field scenario can be developed.
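
To make the beamforming step concrete, here is a minimal NumPy sketch of a mask-based generalized eigenvalue (GEV) beamformer. It is an illustration under stated assumptions, not the authors' implementation: the function names, the `(F, T, C)` array layout (frequency bins, frames, channels), and the diagonal loading constant `eps` are all hypothetical, and the speech and noise masks are simply passed in as arrays, whereas in the paper they are produced by neural networks. For each frequency bin, the masks weight the multichannel STFT frames to estimate speech and noise PSD matrices, and the beamformer is the principal generalized eigenvector of that matrix pair.

```python
import numpy as np


def masked_psd(Y, mask):
    """Mask-weighted spatial PSD matrices.

    Y:    complex STFT, shape (F, T, C)  (assumed layout)
    mask: real weights in [0, 1], shape (F, T)
    Returns an array of shape (F, C, C).
    """
    weighted = Y * mask[..., None]
    # Sum over frames of m(t, f) * y(t, f) y(t, f)^H
    psd = np.einsum("ftc,ftd->fcd", weighted, Y.conj())
    norm = np.maximum(mask.sum(axis=1), 1e-10)
    return psd / norm[:, None, None]


def gev_beamformer(psd_speech, psd_noise, eps=1e-6):
    """Principal generalized eigenvector of (psd_speech, psd_noise) per bin.

    eps is a hypothetical diagonal-loading factor that keeps the noise
    matrix positive definite so the Cholesky factorization succeeds.
    Returns beamforming weights of shape (F, C).
    """
    F, C, _ = psd_speech.shape
    w = np.empty((F, C), dtype=complex)
    for f in range(F):
        load = eps * np.trace(psd_noise[f]).real / C + 1e-10
        noise = psd_noise[f] + load * np.eye(C)
        # Whiten with the Cholesky factor: noise = L L^H
        L = np.linalg.cholesky(noise)
        L_inv = np.linalg.inv(L)
        A = L_inv @ psd_speech[f] @ L_inv.conj().T
        A = 0.5 * (A + A.conj().T)  # enforce Hermitian symmetry
        _, vecs = np.linalg.eigh(A)  # eigenvalues in ascending order
        # Map the top eigenvector back: w = L^{-H} v
        w[f] = L_inv.conj().T @ vecs[:, -1]
    return w


def apply_beamformer(w, Y):
    """Compute w^H y for every bin and frame; returns shape (F, T)."""
    return np.einsum("fc,ftc->ft", w.conj(), Y)
```

A typical call, given a multichannel STFT `Y` and two masks `speech_mask` and `noise_mask`, would be `apply_beamformer(gev_beamformer(masked_psd(Y, speech_mask), masked_psd(Y, noise_mask)), Y)`. The GEV solution is defined only up to a complex scale per frequency bin, so practical systems often add a normalization step afterwards; that step is omitted here.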
