DEREVERBERATION AND BEAMFORMING IN FAR-FIELD SPEAKER RECOGNITION
- Submitted by:
- Ladislav Mosner
- Last updated:
- 13 April 2018 - 5:47am
- Document Type:
- Poster
- Document Year:
- 2018
- Presenters:
- Ladislav Mosner
- Paper Code:
- 3426
This paper deals with far-field speaker recognition. On a corpus of NIST SRE 2010 data retransmitted in a real room with multiple microphones, we first demonstrate how room acoustics cause significant degradation of a state-of-the-art i-vector-based speaker recognition system. We then investigate several techniques to improve performance, ranging from probabilistic linear discriminant analysis (PLDA) re-training, through dereverberation, to beamforming. We found that weighted prediction error (WPE) based dereverberation combined with a generalized eigenvalue beamformer with power spectral density (PSD) weighting masks generated by neural networks (NN) provides results approaching the clean close-microphone setup. Further improvement was obtained by re-training PLDA or the mask-generating NNs on simulated target data. The work shows that a speaker recognition system working robustly in the far-field scenario can be developed.
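To make the beamforming step concrete, the following is a minimal NumPy sketch of a mask-driven generalized eigenvalue (GEV) beamformer. It is an illustration under stated assumptions, not the poster's implementation: the function names (`masked_psd`, `gev_weights`, `apply_beamformer`), the array layout (frequency, time, microphone), and the diagonal loading value are all hypothetical. The time-frequency masks are taken as given arrays, whereas in the work above they are produced by neural networks. WPE dereverberation and any post-filtering are omitted.

```python
import numpy as np


def masked_psd(stft, mask, eps=1e-10):
    """Mask-weighted spatial PSD matrix per frequency bin.

    stft: complex array, shape (F, T, M) = (frequency, time, microphone).
    mask: real array in [0, 1], shape (F, T).
    Returns an array of shape (F, M, M).
    """
    weighted = stft * mask[..., None]
    # Sum over time of mask * y * y^H for every frequency bin.
    psd = np.einsum("ftm,ftn->fmn", weighted, stft.conj())
    return psd / (mask.sum(axis=1)[:, None, None] + eps)


def gev_weights(psd_speech, psd_noise, diag_load=1e-6):
    """Beamformer weights maximizing output SNR per frequency bin.

    Solves psd_speech w = lambda psd_noise w and keeps the eigenvector
    of the largest eigenvalue. diag_load is an assumed regularizer.
    """
    n_freq, n_mics, _ = psd_speech.shape
    weights = np.empty((n_freq, n_mics), dtype=complex)
    eye = np.eye(n_mics)
    for f in range(n_freq):
        # Regularize the noise matrix so the Cholesky factorization exists.
        load = diag_load * np.trace(psd_noise[f]).real / n_mics + 1e-10
        chol = np.linalg.cholesky(psd_noise[f] + load * eye)
        chol_inv = np.linalg.inv(chol)
        # Whitening turns the generalized problem into a Hermitian one.
        whitened = chol_inv @ psd_speech[f] @ chol_inv.conj().T
        whitened = 0.5 * (whitened + whitened.conj().T)
        _, vecs = np.linalg.eigh(whitened)  # eigenvalues in ascending order
        weights[f] = chol_inv.conj().T @ vecs[:, -1]
    return weights


def apply_beamformer(weights, stft):
    """Return the single-channel output w^H y, shape (F, T)."""
    return np.einsum("fm,ftm->ft", weights.conj(), stft)
```

In this sketch, one would estimate `psd_speech` and `psd_noise` by calling `masked_psd` with a speech mask and a noise mask respectively, then pass the multichannel STFT and the result of `gev_weights` to `apply_beamformer`. The GEV solution is defined only up to a complex scale per frequency bin, so practical systems typically add a normalization step, which is left out here.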