ICASSP2020 TEXT-INDEPENDENT SPEAKER VERIFICATION WITH ADVERSARIAL LEARNING ON SHORT UTTERANCES
- Submitted by: KAI LIU
- Last updated: 13 May 2020 - 9:57pm
- Document Type: Presentation Slides
- Document Year: 2020
- Presenters: KAI LIU
- Paper Code: 10.1109/ICASSP40776.2020.9054036
Text-independent speaker verification systems suffer severe performance degradation under short-utterance conditions. To address this problem, we propose an adversarially learned embedding mapping model that directly maps a short-utterance embedding to an enhanced embedding with increased discriminability. In particular, we investigate a Wasserstein GAN combined with a range of loss criteria. These loss functions have distinct optimization objectives, and some of them are less favoured in the speaker verification research area. Unlike most prior studies, our main objective is to investigate the effectiveness of these loss criteria through extensive ablation studies. Experiments on the VoxCeleb dataset show that some criteria benefit verification performance while others have only trivial effects. Finally, a Wasserstein GAN with the chosen loss criteria, without fine-tuning, achieves meaningful improvements over the baseline, with relative improvements of 4% in EER and 7% in minDCF in the challenging scenario of short 2-second utterances.
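
To make the adversarial objective concrete, the following is a minimal sketch of the standard Wasserstein GAN critic and generator losses, plus one possible auxiliary criterion. It is not the authors' implementation: the function names, the toy scores, and the choice of cosine distance as the extra criterion are assumptions made for illustration, since the abstract does not list the specific loss criteria that were studied.

```python
import math
from statistics import fmean


def critic_loss(real_scores, fake_scores):
    """Standard Wasserstein critic loss (to be minimized).

    real_scores: critic outputs for reference embeddings
                 (assumed here to come from long utterances).
    fake_scores: critic outputs for enhanced embeddings produced
                 by the generator from short-utterance embeddings.
    """
    return fmean(fake_scores) - fmean(real_scores)


def generator_loss(fake_scores):
    """Standard Wasserstein generator loss (to be minimized)."""
    return -fmean(fake_scores)


def cosine_distance(a, b):
    """Hypothetical auxiliary criterion: 1 - cosine similarity.

    This is only an example of an extra loss term that could be
    added to the generator objective; it is an assumption, not a
    criterion confirmed by the source.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


# Toy critic scores, invented for illustration only.
real_scores = [1.0, 3.0]
fake_scores = [0.0, 1.0]

d_loss = critic_loss(real_scores, fake_scores)

# Generator objective: adversarial term plus a weighted auxiliary term.
# The weight of 1.0 and the two toy embeddings are arbitrary assumptions.
enhanced = [1.0, 0.0]
reference = [0.0, 1.0]
g_loss = generator_loss(fake_scores) + 1.0 * cosine_distance(enhanced, reference)
```

In an ablation of the kind the abstract describes, terms such as the auxiliary one would be switched on or off, or reweighted, and the resulting EER and minDCF compared against the baseline.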