FULL-INFO TRAINING FOR DEEP SPEAKER FEATURE LEARNING

Abstract: 

Recent studies have shown that speaker patterns can be learned from very short speech segments (e.g., 0.3 seconds) by a carefully designed convolutional and time-delay deep neural network (CT-DNN) model. By training the model to discriminate the speakers in the training data, frame-level speaker features can be derived from the last hidden layer. Despite its good performance, the present model has a potential problem: it involves a parametric classifier, i.e., the last affine layer, which may absorb some of the discriminative knowledge, leading to an ‘information leak’ in feature learning. This paper presents a full-info training approach that discards the parametric classifier and forces all the discriminative knowledge to be learned by the feature net. Our experiments on the Fisher database demonstrate that this new training scheme produces more coherent features, leading to consistent and notable performance improvement on the speaker verification task.
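
The abstract does not spell out how the parametric classifier is removed. As a rough illustration only, and not necessarily the authors' formulation, the sketch below assumes that the trainable affine layer is replaced by fixed per-speaker mean feature vectors, so that the loss depends only on what the feature net produces. The function names `speaker_means` and `classifier_free_loss`, the toy data, and the dot-product scoring are all hypothetical choices made for this example.

```python
import numpy as np

def speaker_means(features, labels, num_speakers):
    """Average the frame-level features of each speaker.

    Assumption for this sketch: these means stand in for the rows of the
    affine classifier, so no classifier weights are trained.
    Every speaker is assumed to have at least one frame.
    """
    means = np.zeros((num_speakers, features.shape[1]))
    for s in range(num_speakers):
        means[s] = features[labels == s].mean(axis=0)
    return means

def classifier_free_loss(features, labels, means):
    """Cross-entropy with logits given by dot products with the speaker means."""
    logits = features @ means.T
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

# Toy data: 3 hypothetical speakers, 2 frames each, 3-dimensional features.
rng = np.random.default_rng(0)
labels = np.array([0, 0, 1, 1, 2, 2])
features = 5.0 * np.eye(3)[labels] + 0.1 * rng.standard_normal((6, 3))

means = speaker_means(features, labels, num_speakers=3)
loss = classifier_free_loss(features, labels, means)
```

In a real training loop the features would come from the CT-DNN's last hidden layer and the means would be refreshed as the network changes. How often to refresh them is a design choice the abstract does not address.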


Paper Details

Authors: Lantian Li, Zhiyuan Tang, Dong Wang, Thomas Fang Zheng
Submitted On: 20 April 2018 - 7:38am
Short Link:
Type: Poster
Event:
Presenter's Name: Miao Zhang
Paper Code: 3967
Document Year: 2018

Document Files

180418-Full_info-LLT.pptx


[1] Lantian Li, Zhiyuan Tang, Dong Wang, Thomas Fang Zheng, "FULL-INFO TRAINING FOR DEEP SPEAKER FEATURE LEARNING", IEEE SigPort, 2018. [Online]. Available: http://sigport.org/3100. Accessed: May. 21, 2018.