
Rapid Speaker Adaptation Based on D-code Extracted from BLSTM-RNN in LVCSR

Abstract: 

Recently, several fast speaker adaptation methods based on so-called discriminative speaker codes (SC) have been proposed for hybrid DNN-HMM models and applied to unsupervised speaker adaptation in speech recognition. SC-based methods have been shown to be quite effective in adapting DNNs even when only a very small amount of adaptation data is available. However, these methods require estimating a speaker code for each new speaker through an updating process, so the final results are obtained only through two-pass decoding. In this paper, we propose an alternative d-code extraction method to replace SC; it models speaker information with a BLSTM-RNN and makes one-pass decoding possible. We then introduce a speaker clustering approach that reduces the number of targets of the speaker-BLSTM, which accelerates training and improves ASR performance at the same time. In addition, we provide an interpolation method that makes use of d-codes from the training set to improve recognition accuracy, especially when adaptation data is limited. Experimental results on the Switchboard task show that the proposed methods achieve a relative WER reduction (about 9%) comparable to that of the standard SC-based adaptation method, without the need for two-pass decoding.
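
The abstract does not say how the interpolation with training-set d-codes is carried out. As a rough illustration only, the sketch below assumes one plausible form: the d-code estimated for a test speaker is linearly blended with the mean of its most similar training-set d-codes, ranked by cosine similarity. The function names, the `weight` and `top_k` parameters, and the choice of cosine similarity are all hypothetical and are not taken from the paper.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def interpolate_d_code(test_code, train_codes, weight=0.5, top_k=2):
    """Blend a test speaker's d-code with similar training-set d-codes.

    Assumption (not from the paper): linear interpolation between the
    test d-code and the mean of its top_k nearest training d-codes.
    weight=1.0 keeps the test d-code unchanged.
    """
    # Rank training d-codes by similarity to the test d-code.
    ranked = sorted(train_codes, key=lambda c: cosine(test_code, c), reverse=True)
    nearest = ranked[:top_k]
    if not nearest:
        # Nothing to interpolate with: return a copy of the test d-code.
        return list(test_code)
    # Element-wise mean of the selected training d-codes.
    mean = [sum(c[i] for c in nearest) / len(nearest) for i in range(len(test_code))]
    # Linear interpolation between the test d-code and that mean.
    return [weight * t + (1.0 - weight) * m for t, m in zip(test_code, mean)]
```

Under this assumption, a smaller `weight` leans more heavily on the training-set d-codes, which is the kind of behavior one might want when the test speaker's d-code is estimated from very little adaptation data.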


Paper Details

Authors:
Shaofei Xue, Zhijie Yan, Zhiying Huang, Lirong Dai
Submitted On:
14 October 2016 - 12:31pm
Short Link:
Type:
Presentation Slides
Event:
Presenter's Name:
Shaofei Xue
Paper Code:
O11-3
Document Year:
2016
Cite

Document Files

Rapid Speaker Adaptation Based on D-code Extracted from BLSTM-RNN in LVCSR.pdf


[1] Shaofei Xue, Zhijie Yan, Zhiying Huang, Lirong Dai, "Rapid Speaker Adaptation Based on D-code Extracted from BLSTM-RNN in LVCSR", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/1208. Accessed: Aug. 03, 2020.