
Language Modeling for Speech and SLP (SLP-LANG)

Dialog Context Language Modeling with Recurrent Neural Networks


We propose contextual language models that incorporate dialog-level discourse information into language modeling. Previous work on contextual language modeling treats preceding utterances as a sequence of inputs, without considering dialog interactions. We design recurrent neural network (RNN) based contextual language models that specifically track the interactions between speakers in a dialog. Experimental results on the Switchboard Dialog Act Corpus show that the proposed model outperforms a conventional single-turn RNN language model by 3.3% in perplexity.
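The core idea, carrying dialog context across turns instead of resetting the model at each utterance, can be sketched with a toy scalar-state RNN. The mixing scheme and all names below are illustrative assumptions, not the paper's architecture:

```python
import math

def rnn_turn(h0, tokens, w_h=0.5, w_x=0.3):
    """Run a toy scalar RNN over one utterance of integer token ids."""
    h = h0
    for x in tokens:
        h = math.tanh(w_h * h + w_x * x)
    return h

def dialog_lm_states(turns):
    """Carry dialog context across turns: each turn's initial RNN state
    mixes the speaker's own last state with the other speaker's last
    state, so speaker interactions influence the next-word distribution."""
    last = {0: 0.0, 1: 0.0}                       # final state per speaker
    states = []
    for speaker, tokens in turns:
        other = 1 - speaker
        h0 = 0.5 * (last[speaker] + last[other])  # interaction-aware init
        h = rnn_turn(h0, tokens)
        last[speaker] = h
        states.append(h)
    return states
```

A single-turn baseline would instead start every utterance from `h0 = 0.0`, discarding the dialog history entirely.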

Paper Details

Authors:
Bing Liu, Ian Lane
Submitted On:
9 March 2017 - 4:59pm

[1] Bing Liu, Ian Lane, "Dialog Context Language Modeling with Recurrent Neural Networks", IEEE SigPort, 2017. [Online]. Available: http://sigport.org/1730. Accessed: Aug. 24, 2017.

CHARACTER-LEVEL LANGUAGE MODELING WITH HIERARCHICAL RECURRENT NEURAL NETWORKS


Recurrent neural network (RNN) based character-level language models (CLMs) are inherently well suited to modeling out-of-vocabulary words. However, their performance is generally much worse than that of word-level language models (WLMs), since CLMs need to consider a longer history of tokens to properly predict the next one. We address this problem by proposing hierarchical RNN architectures, which consist of multiple modules operating at different timescales.
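The multiple-timescale idea can be sketched with two toy scalar modules: a fast one that ticks every character and a slow one that ticks only at word boundaries, which is how the hierarchy shortens the effective history the slow module must carry. The update rules below are illustrative, not the paper's architecture:

```python
import math

def hierarchical_char_states(text, w_c=0.4, w_w=0.6):
    """Toy two-timescale state tracker: the character module updates at
    every character; the word module updates only at word boundaries
    (spaces), consuming the character module's summary and resetting it."""
    h_char, h_word = 0.0, 0.0
    for ch in text:
        if ch == " ":                  # word boundary: slow module ticks
            h_word = math.tanh(w_w * h_word + h_char)
            h_char = 0.0               # fast module restarts per word
        else:
            h_char = math.tanh(w_c * h_char + (ord(ch) % 7) / 7.0)
    return h_char, h_word
```

The slow module sees one update per word rather than one per character, so its recurrence spans far fewer steps for the same text.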


Paper Details

Authors:
Kyuyeon Hwang, Wonyong Sung
Submitted On:
6 March 2017 - 3:05am

[1] Kyuyeon Hwang, Wonyong Sung, "CHARACTER-LEVEL LANGUAGE MODELING WITH HIERARCHICAL RECURRENT NEURAL NETWORKS", IEEE SigPort, 2017. [Online]. Available: http://sigport.org/1645. Accessed: Aug. 24, 2017.

Directed Automatic Speech Transcription Error Correction Using Bidirectional LSTM


In automatic speech recognition (ASR), error correction after the initial search stage is a commonly used technique to improve performance. While fully automatic error correction, such as complete second-pass rescoring with complex language models, is widely used, directed error correction, where the error locations are given manually, is of great interest in many scenarios. Previous work on directed error correction usually uses the error location information to modify the search space of the original ASR models.
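The directed setting can be sketched as a rescoring interface: the error position is supplied, and only the word at that position is re-chosen using both left and right context. The paper scores candidates with a bidirectional LSTM; `toy_score` below is an illustrative stand-in for that model, not the paper's implementation:

```python
def correct_at(hypothesis, error_idx, candidates, score):
    """Directed error correction: the error location is given, so only the
    candidates at that position are rescored, using both left and right
    context (`score` may be any callable; the paper uses a BLSTM)."""
    left = hypothesis[:error_idx]
    right = hypothesis[error_idx + 1:]
    best = max(candidates, key=lambda w: score(left, w, right))
    return left + [best] + right

def toy_score(left, word, right):
    """Toy stand-in scorer: prefer words sharing letters with the context."""
    ctx = "".join(left + right)
    return sum(ch in ctx for ch in word)

fixed = correct_at(["the", "cat", "zat", "on"], 2, ["sat", "zzz"], toy_score)
# fixed == ["the", "cat", "sat", "on"]
```

Because the right context is available at correction time, a bidirectional model is a natural fit for scoring the marked position.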


Paper Details

Authors:
Da Zheng, Zhehuai Chen, Yue Wu, Kai Yu
Submitted On:
18 October 2016 - 1:03pm

[1] Da Zheng, Zhehuai Chen, Yue Wu, Kai Yu, "Directed Automatic Speech Transcription Error Correction Using Bidirectional LSTM", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/1259. Accessed: Aug. 24, 2017.

Exploiting noisy web data by OOV ranking for low-resource keyword search


Spoken keyword search in low-resource conditions suffers from the out-of-vocabulary (OOV) problem and from insufficient text data for language model (LM) training. Web-crawled text data is used to expand the vocabulary and to augment the language model. However, the mismatch between web text and the target speech data makes effective utilization difficult: new words from web data need to be evaluated in order to exclude noisy words or to introduce proper probabilities. In this paper, several criteria for ranking new words from web data are investigated and used as features
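The ranking step might look like the sketch below: each candidate word receives a linear combination of feature scores. The two features shown (log web frequency and character overlap with the in-vocabulary character set) are illustrative assumptions, not the paper's actual criteria:

```python
import math

def rank_new_words(web_counts, in_vocab_chars, weights=(1.0, 1.0)):
    """Rank candidate OOV words harvested from web text, best first.
    Feature 1: log web frequency (frequent words are less likely noise).
    Feature 2: fraction of characters also seen in in-vocabulary words
    (a crude proxy for matching the target language's writing system)."""
    w_freq, w_chr = weights
    scored = []
    for word, count in web_counts.items():
        f_freq = math.log(1 + count)
        f_chr = sum(c in in_vocab_chars for c in word) / len(word)
        scored.append((w_freq * f_freq + w_chr * f_chr, word))
    return [w for _, w in sorted(scored, reverse=True)]
```

Words ranked highly could then be admitted to the vocabulary with probabilities derived from their scores, while low-ranked noise is excluded.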

Paper Details

Authors:
Ji Wu
Submitted On:
15 October 2016 - 7:55am

[1] Ji Wu, "Exploiting noisy web data by OOV ranking for low-resource keyword search", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/1232. Accessed: Aug. 24, 2017.

Learning FOFE based FNN-LMs with noise contrastive estimation and part-of-speech features


A simple but powerful class of language models, fixed-size ordinally-forgetting encoding (FOFE) based feedforward neural network language models (FNN-LMs), has been proposed recently. Experimental results have shown that FOFE based FNN-LMs can outperform not only standard FNN-LMs but also the popular recurrent neural network language models (RNN-LMs). In this paper, we extend FOFE based FNN-LMs in several respects. Firstly, we propose a new method to further improve the performance of FOFE based FNN-LMs by
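FOFE itself rests on a one-line recursion, z_t = alpha * z_{t-1} + e_t, where e_t is the one-hot vector of the t-th word and 0 < alpha < 1 is the forgetting factor, folding an arbitrarily long history into one fixed-size vector. A minimal sketch:

```python
def fofe_encode(token_ids, vocab_size, alpha=0.7):
    """Fixed-size ordinally-forgetting encoding of a token sequence:
    z_t = alpha * z_{t-1} + e_t, with e_t the one-hot vector of token t.
    Older tokens are discounted geometrically by alpha, so word order is
    preserved in a single vocab-sized vector."""
    z = [0.0] * vocab_size
    for t in token_ids:
        z = [alpha * v for v in z]   # decay the existing history
        z[t] += 1.0                  # add the current one-hot token
    return z
```

The resulting vector feeds a plain feedforward network, which is what makes FOFE based FNN-LMs cheap to train relative to RNN-LMs while still seeing unbounded history.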

Paper Details

Authors:
Junfeng Hou, Shiliang Zhang, Lirong Dai
Submitted On:
14 October 2016 - 4:49am

[1] Junfeng Hou, Shiliang Zhang, Lirong Dai, "Learning FOFE based FNN-LMs with noise contrastive estimation and part-of-speech features", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/1185. Accessed: Aug. 24, 2017.

Language Model Adaptation for ASR of Spoken Translations Using Phrase-based Translation Models and Named Entity Models

Paper Details

Authors:
Joris Pelemans, Tom Vanallemeersch, Kris Demuynck, Lyan Verwimp, Hugo Van hamme, Patrick Wambacq
Submitted On:
4 April 2016 - 10:11am

[1] Joris Pelemans, Tom Vanallemeersch, Kris Demuynck, Lyan Verwimp, Hugo Van hamme, Patrick Wambacq, "Language Model Adaptation for ASR of Spoken Translations Using Phrase-based Translation Models and Named Entity Models", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/1083. Accessed: Aug. 24, 2017.

CUED-RNNLM – An Open-Source Toolkit for Efficient Training and Evaluation of Recurrent Neural Network Language Models


In recent years, recurrent neural network language models (RNNLMs) have become increasingly popular for a range of applications, including speech recognition. However, the training of RNNLMs is computationally expensive, which limits the quantity of data and the size of network that can be used. In order to fully exploit the power of RNNLMs, efficient training implementations are required. This paper introduces an open-source toolkit, the CUED-RNNLM toolkit, which supports efficient GPU-based training of RNNLMs.


Paper Details

Authors:
Xie Chen, Yanmin Qian, Xunying Liu, Mark Gales, Phil Woodland
Submitted On:
1 April 2016 - 6:35am

[1] Xie Chen, Yanmin Qian, Xunying Liu, Mark Gales, Phil Woodland, "CUED-RNNLM – An Open-Source Toolkit for Efficient Training and Evaluation of Recurrent Neural Network Language Models", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/1065. Accessed: Aug. 24, 2017.

Semantic Word Embedding Neural Network Language Models for Automatic Speech Recognition

Paper Details

Authors:
Abhinav Sethy, Bhuvana Ramabhadran
Submitted On:
23 March 2016 - 6:16pm

[1] Abhinav Sethy, Bhuvana Ramabhadran, "Semantic Word Embedding Neural Network Language Models for Automatic Speech Recognition", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/1002. Accessed: Aug. 24, 2017.

A HIERARCHICAL FRAMEWORK FOR LANGUAGE IDENTIFICATION



Most current language recognition systems model different levels of information, such as acoustic, prosodic, and phonotactic cues, independently, and combine the model likelihoods in order to make a decision. However, these are single-level systems that treat all languages identically and are hence incapable of exploiting any similarities that may exist within groups of languages.
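The hierarchical alternative can be sketched as a two-stage decision: first choose a language group, then discriminate only among the languages inside that group, so the second stage can focus on fine differences that a flat system averages away. The classifiers below are toy stand-ins, not the paper's acoustic or phonotactic models:

```python
def hierarchical_lid(features, group_clf, within_group_clfs):
    """Two-stage language identification: a group-level classifier routes
    the input to a group-specific classifier that decides the language."""
    group = group_clf(features)
    return within_group_clfs[group](features)

# Toy stand-in classifiers over hand-made boolean features (illustrative only).
group_clf = lambda f: "romance" if f["latin_script"] else "cjk"
within = {
    "romance": lambda f: "es" if f["has_enye"] else "fr",
    "cjk": lambda f: "ja",
}
lang = hierarchical_lid({"latin_script": True, "has_enye": True}, group_clf, within)
# lang == "es"
```

Each within-group classifier only ever has to separate similar languages, which is where group-specific cues pay off.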

Paper Details

Authors:
Vidhyasaharan Sethu, Haris Bavattichalil, Eliathamby Ambikairajah, Haizhou Li
Submitted On:
14 March 2016 - 1:21am

[1] Vidhyasaharan Sethu, Haris Bavattichalil, Eliathamby Ambikairajah, Haizhou Li, "A HIERARCHICAL FRAMEWORK FOR LANGUAGE IDENTIFICATION", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/660. Accessed: Aug. 24, 2017.

Investigation on log-linear interpolation of multi-domain neural network language model



Paper Details

Authors:
Kazuki Irie, Ralf Schlüter, Hermann Ney
Submitted On:
30 March 2016 - 6:04am

[1] Kazuki Irie, Ralf Schlüter, Hermann Ney, "Investigation on log-linear interpolation of multi-domain neural network language model", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/645. Accessed: Aug. 24, 2017.