Adversarial Speaker Verification
- Submitted by: Zhong Meng
- Last updated: 12 May 2019 - 9:24pm
- Document Type: Poster
- Document Year: 2019
- Presenters: Yong Zhao
- Paper Code: 3792
The use of deep networks to extract embeddings for speaker recognition has proven successful. However, such embeddings are susceptible to performance degradation caused by mismatches among the training, enrollment, and test conditions. In this work, we propose an adversarial speaker verification (ASV) scheme that learns condition-invariant deep embeddings via adversarial multi-task training. In ASV, a speaker classification network and a condition identification network are jointly optimized to minimize the speaker classification loss and simultaneously mini-maximize the condition loss, so that the condition network learns to identify the condition while the embedding is trained to make that identification harder. The target labels of the condition network can be either categorical (environment types) or continuous (SNR values). We further propose multi-factorial ASV to simultaneously suppress multiple factors that constitute the condition variability. Evaluated on a Microsoft Cortana text-dependent speaker verification task, ASV achieves 8.8% and 14.5% relative improvements in equal error rate (EER) for known and unknown conditions, respectively.
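The mini-max training described in the abstract can be illustrated with a toy update step. The following is a minimal sketch, not the authors' implementation: it assumes a scalar "embedding", a binary speaker label, a continuous SNR target, and a gradient-reversal-style update, which is one common way to realize this kind of adversarial multi-task objective. The function name `asv_step`, the parameters `theta_f`, `theta_y`, `theta_c`, and the adversarial weight `lam` are all hypothetical names introduced for illustration.

```python
import math


def asv_step(theta_f, theta_y, theta_c, x, speaker, snr, lr=0.1, lam=0.5):
    """One SGD step of adversarial multi-task training on a single example.

    Toy scalar setup (all assumptions, not taken from the poster):
      theta_f: feature extractor weight, embedding e = theta_f * x
      theta_y: speaker classifier weight (binary logistic, speaker in {0, 1})
      theta_c: condition network weight (regresses a continuous SNR value)
      lam:     weight of the reversed condition gradient
    """
    e = theta_f * x  # embedding

    # Speaker loss: binary cross-entropy on sigmoid(theta_y * e).
    p = 1.0 / (1.0 + math.exp(-theta_y * e))
    loss_spk = -(speaker * math.log(p) + (1 - speaker) * math.log(1 - p))
    d_spk_de = (p - speaker) * theta_y
    d_spk_dy = (p - speaker) * e

    # Condition loss: squared error between predicted and true SNR.
    pred = theta_c * e
    loss_cond = 0.5 * (pred - snr) ** 2
    d_cond_de = (pred - snr) * theta_c
    d_cond_dc = (pred - snr) * e

    # Both heads minimize their own loss.
    new_y = theta_y - lr * d_spk_dy
    new_c = theta_c - lr * d_cond_dc

    # The feature extractor minimizes the speaker loss but receives the
    # condition gradient with its sign flipped, so it maximizes the
    # condition loss.
    new_f = theta_f - lr * (d_spk_de - lam * d_cond_de) * x

    return new_f, new_y, new_c, loss_spk, loss_cond


if __name__ == "__main__":
    print(asv_step(1.0, 0.0, 1.0, x=1.0, speaker=1, snr=0.0))
```

In this sketch the condition network descends its own loss while the feature extractor ascends it, which is the sense in which the condition loss is "mini-maximized". Under the same assumptions, a multi-factorial variant would add one condition head per factor and subtract each head's weighted gradient in the feature extractor update.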