Adversarial Speaker Adaptation

Abstract: 

We propose a novel adversarial speaker adaptation (ASA) scheme, in which adversarial learning is applied to regularize the distribution of deep hidden features in a speaker-dependent (SD) deep neural network (DNN) acoustic model so that it stays close to that of a fixed speaker-independent (SI) DNN acoustic model during adaptation. An additional discriminator network is introduced to distinguish the deep features generated by the SD model from those produced by the SI model. In ASA, with a fixed SI model as the reference, the SD model is jointly optimized with the discriminator network to minimize the senone classification loss and, simultaneously, to mini-maximize the SI/SD discrimination loss on the adaptation data. With ASA, the SD model learns a senone-discriminative deep feature whose distribution is similar to that of the SI model. With such a regularized and adapted deep feature, the SD model achieves improved automatic speech recognition on the target speaker's speech. Evaluated on the Microsoft short message dictation dataset, with 200 adaptation utterances per speaker, ASA achieves 14.4% and 7.9% relative word error rate improvements for supervised and unsupervised adaptation, respectively, over an SI model trained on 2600 hours of data.
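The two competing objectives described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the discriminator outputs the probability that a deep feature came from the SD model, that both losses are plain cross-entropies averaged over frames, and that a trade-off weight (here called `lam`, a hypothetical name not given in the abstract) balances the two terms. All function names are illustrative.

```python
import math

def senone_cross_entropy(posteriors, labels):
    """Mean negative log posterior of the correct senone, one entry per frame."""
    return -sum(math.log(p[y]) for p, y in zip(posteriors, labels)) / len(labels)

def discrimination_loss(d_on_sd, d_on_si):
    """Binary cross-entropy of the discriminator.

    Assumption: each value is the discriminator's estimate of
    P(feature came from the SD model). SD features should score
    near 1 and SI features near 0 for the loss to be small.
    """
    loss_sd = -sum(math.log(p) for p in d_on_sd) / len(d_on_sd)
    loss_si = -sum(math.log(1.0 - p) for p in d_on_si) / len(d_on_si)
    return loss_sd + loss_si

def asa_objectives(posteriors, labels, d_on_sd, d_on_si, lam=1.0):
    """Return (SD-model objective, discriminator objective), both to be minimized.

    The SD model lowers the senone loss while raising the discrimination
    loss (making its features hard to tell apart from SI features); the
    discriminator lowers the discrimination loss. `lam` is a hypothetical
    trade-off weight.
    """
    l_senone = senone_cross_entropy(posteriors, labels)
    l_disc = discrimination_loss(d_on_sd, d_on_si)
    sd_objective = l_senone - lam * l_disc
    disc_objective = l_disc
    return sd_objective, disc_objective

# Toy example: one frame, two senones, discriminator at chance level.
sd_obj, disc_obj = asa_objectives(
    posteriors=[[0.5, 0.5]], labels=[0], d_on_sd=[0.5], d_on_si=[0.5]
)
```

In a real system the two objectives would be minimized alternately or through a gradient reversal layer, with the SI model's parameters kept fixed throughout; the sketch only shows how the loss values relate.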

Paper Details

Authors:
Zhong Meng, Jinyu Li, Yifan Gong
Submitted On:
12 May 2019 - 9:26pm
Type:
Presentation Slides
Presenter's Name:
Yifan Gong
Paper Code:
3792
Document Year:
2019

Document Files

asa_oral_v3.pptx


[1] Zhong Meng, Jinyu Li, Yifan Gong, "Adversarial Speaker Adaptation", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4475. Accessed: May. 23, 2019.