Unsupervised Speaker Adaptation of BLSTM-RNN for LVCSR Based on Speaker Code

Abstract: 

Recently, speaker code based adaptation has been successfully extended to recurrent neural networks with bidirectional Long Short-Term Memory (BLSTM-RNN) [1]. Experiments on the small-scale TIMIT task have demonstrated that speaker code based adaptation is also valid for BLSTM-RNN. In this paper, we evaluate the method on a large-scale task and introduce an error normalization method that balances the back-propagation errors reaching the speaker codes from different layers. We also use singular value decomposition (SVD) to compress the model. Results show that speaker code based adaptation with SVD achieves better recognition performance than i-vector based speaker adaptation of the same dimension. Experimental results on the Switchboard task show that speaker code based adaptation on the hybrid BLSTM-DNN topology achieves a relative reduction in word error rate (WER) of more than 9% compared with the speaker-independent (SI) baseline.
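
The abstract does not spell out how SVD is applied, so the following is only a minimal sketch of the generic low-rank technique, not the authors' implementation. It assumes a single dense weight matrix is factored into two thinner matrices by truncated SVD; the function name `svd_compress`, the 512 x 256 layer shape, and the rank of 64 are hypothetical choices made for illustration.

```python
import numpy as np


def svd_compress(W, rank):
    """Approximate W (m x n) by A @ B, with A of shape (m, rank) and B of shape (rank, n).

    Hypothetical helper: illustrates truncated-SVD compression in general,
    not the specific procedure used in the paper.
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]  # scale the kept left singular vectors by their singular values
    B = Vt[:rank, :]            # kept right singular vectors
    return A, B


rng = np.random.default_rng(0)
W = rng.standard_normal((512, 256))  # stand-in for one layer's weight matrix (assumed shape)

A, B = svd_compress(W, rank=64)

params_before = W.size          # m * n parameters in the original layer
params_after = A.size + B.size  # rank * (m + n) parameters after factorization
rel_error = np.linalg.norm(W - A @ B) / np.linalg.norm(W)

print(params_before, params_after, rel_error)
```

Replacing one m x n layer with two layers of shapes m x r and r x n reduces the parameter count whenever r < mn / (m + n). The approximation error depends on how quickly the singular values of the real weight matrix decay, which the random matrix used here does not represent.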

Paper Details

Authors:
Zhiying Huang, Shaofei Xue, Zhijie Yan, Lirong Dai
Submitted On:
14 October 2016 - 10:15am
Short Link:
Type:
Presentation Slides
Event:
Presenter's Name:
Zhiying Huang
Paper Code:
ISCLSP-O11-1
Document Year:
2016

Document Files

ISCSLP_presentation_ZhiyingHuang_upload.pdf


Cite as: Zhiying Huang, Shaofei Xue, Zhijie Yan, Lirong Dai, "Unsupervised Speaker Adaptation of BLSTM-RNN for LVCSR Based on Speaker Code", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/1198. Accessed: Aug. 19, 2019.