
Acoustic Modeling for Automatic Speech Recognition (SPE-RECO)

Compact Kernel Models for Acoustic Modeling via Random Feature Selection


A simple but effective method is proposed for learning compact random-feature models that approximate non-linear kernel methods, in the context of acoustic modeling. The method explores a large number of non-linear features more efficiently than existing approaches while maintaining a compact model via feature selection. For certain kernels, this random feature selection may be regarded as a means of non-linear feature selection at the level of the raw input features, which motivates additional methods for computational improvement.
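As context for the abstract above, the general idea can be sketched as random Fourier features approximating a Gaussian kernel, followed by retaining a small scored subset of those features. This is an illustrative sketch only: the dimensions, the correlation-based scoring rule, and the target vector `y` are hypothetical stand-ins, not the authors' actual algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
d, D = 40, 2000          # input dimension, number of random features
X = rng.standard_normal((100, d))

# Random Fourier features: z(x) = sqrt(2/D) * cos(W x + b) approximates
# the Gaussian kernel k(x, y) = exp(-||x - y||^2 / 2).
W = rng.standard_normal((D, d))
b = rng.uniform(0.0, 2 * np.pi, D)
Z = np.sqrt(2.0 / D) * np.cos(X @ W.T + b)

# Exact kernel matrix vs. its random-feature approximation.
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K_exact = np.exp(-sq_dists / 2.0)
K_approx = Z @ Z.T
print(np.abs(K_exact - K_approx).max())   # small for large D

# Compact model via feature selection (hypothetical scoring rule):
# keep only the T features most correlated with a training signal y.
y = rng.standard_normal(100)
T = 200
scores = np.abs(Z.T @ y)
keep = np.argsort(scores)[-T:]
Z_compact = Z[:, keep]    # T << D columns retained
```

The point of the sketch is the shape of the computation: many random features are generated cheaply, but only a small selected subset needs to be kept in the final model.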

Paper Details

Authors:
Michael Collins, Daniel Hsu, Brian Kingsbury
Submitted On:
23 March 2016 - 7:16pm

Document Files

ICASSP_Poster_Avner_May_v3.pdf


[1] Michael Collins, Daniel Hsu, Brian Kingsbury, "Compact Kernel Models for Acoustic Modeling via Random Feature Selection", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/1005. Accessed: Jun. 19, 2019.

Discriminatively Trained Joint Speaker and Environment Representations for Adaptation of Deep Neural Network Acoustic Models

Paper Details

Authors:
Sunil Sivadas, Kai Yu, Bin Ma
Submitted On:
22 March 2016 - 11:46am

Document Files

mfy-icassp16-slides.pdf


[1] Sunil Sivadas, Kai Yu, Bin Ma, "Discriminatively Trained Joint Speaker and Environment Representations for Adaptation of Deep Neural Network Acoustic Models", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/972. Accessed: Jun. 19, 2019.

Recurrent SVM for Speech Recognition

Paper Details

Authors:
Shi-Xiong Zhang, Rui Zhao, Jinyu Li, Chaojun Liu, Yifan Gong
Submitted On:
19 March 2016 - 4:50am

Document Files

RecurrentSVM_poster.pdf


[1] Shi-Xiong Zhang, Rui Zhao, Jinyu Li, Chaojun Liu, Yifan Gong, "Recurrent SVM for Speech Recognition", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/792. Accessed: Jun. 19, 2019.

On Training the Recurrent Neural Network Encoder-Decoder for Large Vocabulary End-to-end Speech Recognition


Recently, there has been an increasing interest in end-to-end speech recognition using neural networks, with no reliance on hidden Markov models (HMMs) for sequence modelling as in the standard hybrid framework. The recurrent neural network (RNN) encoder-decoder is such a model, performing sequence-to-sequence mapping without any predefined alignment. This model first transforms the input sequence into a fixed-length vector representation, from which the decoder recovers the output sequence. In this paper, we extend
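The encoding step described above — mapping a variable-length input sequence to a fixed-length vector — can be sketched with a plain tanh RNN in NumPy. All dimensions and weights here are hypothetical; the paper's actual model is an encoder-decoder, and this sketch covers only the encoder half.

```python
import numpy as np

rng = np.random.default_rng(1)

def rnn_encode(frames, W_x, W_h):
    """Run a simple tanh RNN over a variable-length sequence of feature
    frames and return the final hidden state as a fixed-length encoding."""
    h = np.zeros(W_h.shape[0])
    for x in frames:
        h = np.tanh(W_x @ x + W_h @ h)
    return h

d, H = 13, 32            # e.g. 13-dim acoustic frames, 32-dim encoding (illustrative)
W_x = 0.1 * rng.standard_normal((H, d))
W_h = 0.1 * rng.standard_normal((H, H))

short_utt = rng.standard_normal((50, d))
long_utt = rng.standard_normal((200, d))

# Utterances of different lengths map to vectors of the same fixed size,
# which is what lets a decoder consume them uniformly.
print(rnn_encode(short_utt, W_x, W_h).shape, rnn_encode(long_utt, W_x, W_h).shape)
```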

Paper Details

Authors:
Liang Lu, Xingxing Zhang, Steve Renals
Submitted On:
18 March 2016 - 12:52pm

Document Files

liang_icassp16_slides.pdf


[1] Liang Lu, Xingxing Zhang, Steve Renals, "On Training the Recurrent Neural Network Encoder-Decoder for Large Vocabulary End-to-end Speech Recognition", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/769. Accessed: Jun. 19, 2019.

Deep convolutional acoustic word embeddings using word-pair side information


Recent studies have been revisiting whole words as the basic modelling unit in speech recognition and query applications, instead of phonetic units. Such whole-word segmental systems rely on a function that maps a variable-length speech segment to a vector in a fixed-dimensional space; the resulting acoustic word embeddings need to allow for accurate discrimination between different word types, directly in the embedding space. We compare several old and new approaches in a word discrimination task.
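The discrimination criterion described above — tokens of the same word type should be closer in the embedding space than tokens of different word types — can be sketched as follows. The 64-dimensional vectors and the perturbation model are hypothetical illustrations, not the paper's learned embeddings.

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity, a common comparison in an embedding space."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical embeddings: two tokens of the same word type are modelled
# as small perturbations of one another; a different word type is an
# unrelated random vector.
rng = np.random.default_rng(2)
anchor = rng.standard_normal(64)
same_word = anchor + 0.1 * rng.standard_normal(64)
diff_word = rng.standard_normal(64)

# Same-type similarity should clearly exceed different-type similarity.
print(cosine_sim(anchor, same_word), cosine_sim(anchor, diff_word))
```

A word discrimination task then reduces to thresholding such similarities over a set of segment pairs.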

Paper Details

Authors:
Herman Kamper, Weiran Wang, Karen Livescu
Submitted On:
17 March 2016 - 4:42am

Document Files

kamper+wang+livescu_icassp2016_talk.pdf


[1] Herman Kamper, Weiran Wang, Karen Livescu, "Deep convolutional acoustic word embeddings using word-pair side information", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/740. Accessed: Jun. 19, 2019.

Two-Stage Noise Aware Training Using Asymmetric Deep Denoising Autoencoder


Ever since the deep neural network (DNN)-based acoustic model appeared, the recognition performance of automatic speech recognition has greatly improved. Building on this achievement, various studies on DNN-based techniques for noise robustness are also in progress. Among these approaches, the noise-aware training (NAT) technique, which aims to improve the inherent robustness of the DNN using noise estimates, has shown remarkable performance. However, despite this strong performance, it is not certain whether NAT is an optimal method for fully exploiting the inherent robustness of the DNN.
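In its simplest common form, the NAT idea mentioned above conditions the network on a noise estimate by appending it to every input frame. The sketch below assumes a crude leading-frames noise estimate and illustrative feature dimensions; it is not the paper's two-stage recipe.

```python
import numpy as np

rng = np.random.default_rng(3)
frames = rng.standard_normal((300, 40))   # noisy acoustic features, 40-dim (illustrative)

# Estimate the noise from leading (assumed non-speech) frames, then append
# the same estimate to every frame so the network can condition on the
# noise environment.
noise_est = frames[:10].mean(axis=0)
nat_input = np.hstack([frames, np.tile(noise_est, (len(frames), 1))])
print(nat_input.shape)   # each frame doubles in width
```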

Paper Details

Authors:
Shin Jae Kang, Woo Hyun Kang, Nam Soo Kim
Submitted On:
17 March 2016 - 1:45am

Document Files

ICASSP2016_포스터_이강현_그래프2.pdf


[1] Shin Jae Kang, Woo Hyun Kang, Nam Soo Kim, "Two-Stage Noise Aware Training Using Asymmetric Deep Denoising Autoencoder", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/734. Accessed: Jun. 19, 2019.
