NEURAL NETWORK LANGUAGE MODELING WITH LETTER-BASED FEATURES AND IMPORTANCE SAMPLING

Abstract: 

In this paper we describe an extension of the Kaldi software toolkit that supports neural network language modeling, intended for use in automatic speech recognition (ASR) and related tasks. We combine subword features (letter n-grams) with one-hot encoding of frequent words so that the models can handle large vocabularies containing infrequent words. We propose a new objective function that allows training with unnormalized probabilities. An importance-sampling-based method is supported to speed up training when the vocabulary is large. Experimental results on five corpora show that Kaldi-RNNLM rivals other recurrent neural network language model toolkits in both performance and training speed.
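The combination of letter n-gram features with one-hot features for frequent words can be illustrated with a minimal Python sketch. This is not the Kaldi-RNNLM implementation: the function names, the `^`/`$` word-boundary markers, the n-gram range, and the `ngram:`/`word:` feature prefixes are all assumptions chosen for illustration.

```python
from collections import Counter


def letter_ngrams(word, n_min=2, n_max=3):
    """Count letter n-grams of a word, with assumed boundary markers ^ and $."""
    padded = "^" + word + "$"
    grams = Counter()
    for n in range(n_min, n_max + 1):
        for i in range(len(padded) - n + 1):
            grams[padded[i:i + n]] += 1
    return grams


def word_features(word, frequent_words, n_min=2, n_max=3):
    """Build a sparse feature dict for a word.

    Every word gets letter n-gram count features; only words in the
    (hypothetical) frequent-word set also get a one-hot word-identity feature.
    """
    feats = {
        "ngram:" + gram: float(count)
        for gram, count in letter_ngrams(word, n_min, n_max).items()
    }
    if word in frequent_words:
        feats["word:" + word] = 1.0
    return feats


if __name__ == "__main__":
    frequent = {"cat", "the"}
    print(word_features("cat", frequent))   # n-gram features plus a word feature
    print(word_features("catty", frequent)) # n-gram features only
```

Under these assumptions, an infrequent word such as "catty" still shares n-gram features (for example `^ca` and `cat`) with the frequent word "cat", which is the sense in which subword features let a model cover rare vocabulary items.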

Paper Details

Authors: Hainan Xu, Ke Li, Yiming Wang, Jian Wang, Shiyin Kang, Xie Chen, Daniel Povey, Sanjeev Khudanpur
Submitted On: 19 April 2018 - 11:52pm
Type: Poster
Presenter's Name: Hainan Xu
Paper Code: 3403
Document Year: 2018

Document Files

kaldi-rnnlm-poster.pdf



[1] Hainan Xu, Ke Li, Yiming Wang, Jian Wang, Shiyin Kang, Xie Chen, Daniel Povey, Sanjeev Khudanpur, "NEURAL NETWORK LANGUAGE MODELING WITH LETTER-BASED FEATURES AND IMPORTANCE SAMPLING", IEEE SigPort, 2018. [Online]. Available: http://sigport.org/3064. Accessed: Oct. 16, 2018.