ISCSLP 2016

Welcome to ISCSLP 2016 - October 17-20, 2016, Tianjin, China

The 10th International Symposium on Chinese Spoken Language Processing (ISCSLP 2016) will be held on October 17-20, 2016 in Tianjin, hosted by Tianjin University. Tianjin has a reputation throughout China for being friendly, safe, and home to delicious food; welcome to Tianjin for ISCSLP 2016. ISCSLP is a biennial conference for scientists, researchers, and practitioners to report and discuss the latest progress in all theoretical and technological aspects of spoken language processing. While ISCSLP focuses primarily on Chinese languages, work on other languages that may be applied to Chinese speech and language processing is also encouraged. The working language of ISCSLP is English.

A multi-channel/multi-speaker interactive 3D Audio-Visual Speech Corpus in Mandarin


This paper presents a multi-channel/multi-speaker 3D audio-visual corpus for Mandarin continuous speech recognition and other fields, such as speech visualization and speech synthesis. The corpus consists of 24 speakers with about 18k utterances, about 20 hours in total. For each utterance, the audio streams were recorded by two professional microphones, in the near field and the far field respectively, while a marker-based 3D facial motion capture system with six infrared cameras was …
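
As a rough illustration of how one record in such a corpus might be organized, a per-utterance entry would pair the two audio channels with the marker trajectories. This is a hypothetical sketch; the field names and shapes below are illustrative, not taken from the paper.

    from dataclasses import dataclass
    import numpy as np

    @dataclass
    class Utterance:
        """Hypothetical per-utterance record: two audio channels plus 3D marker data."""
        speaker_id: str
        transcript: str          # Mandarin transcript of the utterance
        near_field: np.ndarray   # near-field microphone signal, shape (n_samples,)
        far_field: np.ndarray    # far-field microphone signal, shape (n_samples,)
        markers: np.ndarray      # facial marker trajectories, shape (n_frames, n_markers, 3)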

Paper Details

Authors: Jun Yu, Rongfeng Su, Lan Wang, Wenpeng Zhou
Submitted: 14 October 2016
File: 3D Audio-Visual Speech Corpus in Mandarin
Cite: Jun Yu, Rongfeng Su, Lan Wang, Wenpeng Zhou, "A multi-channel/multi-speaker interactive 3D Audio-Visual Speech Corpus in Mandarin", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/1200.

Unsupervised Speaker Adaptation of BLSTM-RNN for LVCSR Based on Speaker Code


Recently, speaker code based adaptation has been successfully extended to recurrent neural networks using bidirectional Long Short-Term Memory (BLSTM-RNN) [1]. Experiments on the small-scale TIMIT task have demonstrated that speaker code based adaptation is also effective for BLSTM-RNN. In this paper, we evaluate this method on a large-scale task and introduce an error normalization method to balance the back-propagation errors derived from different layers for the speaker codes. In addition, we use singular value decomposition (SVD) for model compression.
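
The abstract does not spell out the SVD compression recipe; as a generic sketch of the underlying idea (plain NumPy, placeholder dimensions, not the authors' exact configuration), a weight matrix W is truncated to rank k, replacing one m x n layer with two smaller ones:

    import numpy as np

    def svd_compress(W: np.ndarray, k: int):
        """Approximate W (m x n) by two rank-k factors, cutting parameters
        from m*n to k*(m + n). Generic low-rank compression sketch."""
        U, s, Vt = np.linalg.svd(W, full_matrices=False)
        A = U[:, :k] * s[:k]   # m x k, singular values folded into the left factor
        B = Vt[:k, :]          # k x n
        return A, B

    W = np.random.randn(1024, 2048)
    A, B = svd_compress(W, k=128)
    print(np.linalg.norm(W - A @ B) / np.linalg.norm(W))  # relative approximation error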

Paper Details

Authors: Zhiying Huang, Shaofei Xue, Zhijie Yan, Lirong Dai
Submitted: 14 October 2016
File: ISCSLP_presentation_ZhiyingHuang_upload.pdf
Cite: Zhiying Huang, Shaofei Xue, Zhijie Yan, Lirong Dai, "Unsupervised Speaker Adaptation of BLSTM-RNN for LVCSR Based on Speaker Code", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/1198.

DNN-Based Unit Selection Using Frame-Sized Speech Segments


This paper presents a deep neural network (DNN)-based unit selection method for waveform concatenation speech synthesis using frame-sized speech segments. In this method, three DNNs are adopted to calculate the target costs and concatenation costs used to select frame-sized candidate units. The first DNN is built in the same way as in DNN-based statistical parametric speech synthesis: it predicts target acoustic features given linguistic context inputs.
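
To make the role of the two cost types concrete, here is a minimal sketch of the unit-selection search itself; the cost functions and the weight w are placeholders standing in for the DNN-derived target and concatenation costs, not the paper's implementation:

    import numpy as np

    def select_units(candidates, target_cost, concat_cost, w=1.0):
        """Viterbi search over frame-sized candidate units.

        candidates[t] lists the candidate units for frame t; target_cost(t, u)
        and concat_cost(prev_u, u) stand in for the DNN-derived costs.
        """
        T = len(candidates)
        best = [[target_cost(0, u) for u in candidates[0]]]  # accumulated costs
        back = []                                            # backpointers
        for t in range(1, T):
            row, ptr = [], []
            for u in candidates[t]:
                trans = [best[t - 1][i] + w * concat_cost(v, u)
                         for i, v in enumerate(candidates[t - 1])]
                i_best = int(np.argmin(trans))
                row.append(target_cost(t, u) + trans[i_best])
                ptr.append(i_best)
            best.append(row)
            back.append(ptr)
        j = int(np.argmin(best[-1]))
        path = [j]
        for ptr in reversed(back):                           # trace back the best path
            j = ptr[j]
            path.append(j)
        path.reverse()
        return [candidates[t][j] for t, j in enumerate(path)]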

Paper Details

Authors: Zhen-Hua Ling
Submitted: 14 October 2016
File: ISCLSP2016_zpzhou_presentation.pdf
Cite: Zhen-Hua Ling, "DNN-Based Unit Selection Using Frame-Sized Speech Segments", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/1197.

Long Short-term Memory Recurrent Neural Network based Segment Features for Music Genre Classification


In conventional frame feature based music genre classification methods, the audio data is represented by independent frames and the sequential nature of audio is ignored entirely. If this sequential knowledge is well modeled and combined, the classification performance can be significantly improved. The long short-term memory (LSTM) recurrent neural network (RNN), which uses a set of special memory cells to model long-range feature sequences, has been successfully used for many sequence labeling and sequence …
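
As a minimal sketch of the modeling idea (assuming PyTorch; the dimensions and the last-hidden-state readout are placeholders, not the paper's segment-feature construction), an LSTM consumes the frame sequence and produces a clip-level genre decision:

    import torch
    import torch.nn as nn

    class GenreLSTM(nn.Module):
        """Minimal LSTM sequence classifier: frame features in, genre logits out."""
        def __init__(self, n_feats=40, hidden=128, n_genres=10):
            super().__init__()
            self.lstm = nn.LSTM(n_feats, hidden, batch_first=True)
            self.out = nn.Linear(hidden, n_genres)

        def forward(self, frames):            # frames: (batch, time, n_feats)
            seq, _ = self.lstm(frames)
            return self.out(seq[:, -1, :])    # classify from the final hidden state

    model = GenreLSTM()
    logits = model(torch.randn(8, 300, 40))   # 8 clips, 300 frames of 40-dim features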

Paper Details

Authors: Jia Dai, Shan Liang, Wei Xue, Chongjia Ni, Wenju Liu
Submitted: 14 October 2016
File: ISCSLP2016_JiaDai_pptA4.pdf
Cite: Jia Dai, Shan Liang, Wei Xue, Chongjia Ni, Wenju Liu, "Long Short-term Memory Recurrent Neural Network based Segment Features for Music Genre Classification", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/1195.

Deep Neural Network for Robust Speech Recognition With Auxiliary Features From Laser-Doppler Vibrometer Sensor

Paper Details

Authors: Jun Du, Ian McLoughlin, Yong Xu, Feng Ma, Haikun Wang
Submitted: 15 October 2016
Files: ISCSLP2016--Zhipeng_Xie_LDVs.pptx, ISCSLP2016--Zhipeng_Xie_LDVs.pdf
Cite: Jun Du, Ian McLoughlin, Yong Xu, Feng Ma, Haikun Wang, "Deep Neural Network for Robust Speech Recognition With Auxiliary Features From Laser-Doppler Vibrometer Sensor", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/1194.

Vector Taylor Series Expansion with Auditory Masking for Noise Robust Speech Recognition


In this paper, we address the problem of speech recognition in the presence of additive noise. We investigate the applicability and efficacy of auditory masking in devising a robust front end for noisy features. This is achieved by introducing a masking factor into the Vector Taylor Series (VTS) equations. The resulting first-order VTS approximation is used to compensate the parameters of a clean speech model, and a Minimum Mean Square Error (MMSE) estimate is used to estimate the clean speech …
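
For reference, the standard first-order VTS relation into which such a masking factor would be introduced (the textbook log-mel-domain form, without the paper's masking term) is:

    % log-mel-domain distortion model (channel term omitted):
    \[ y = x + \log\!\left(1 + e^{\,n - x}\right) =: g(x, n) \]
    % first-order expansion around the clean-speech and noise means:
    \[ y \approx g(\mu_x, \mu_n) + G\,(x - \mu_x) + (I - G)\,(n - \mu_n),
       \qquad G = \operatorname{diag}\!\left(\frac{1}{1 + e^{\,\mu_n - \mu_x}}\right) \]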

Paper Details

Authors: Biswajit Das, Ashish Panda
Submitted: 14 October 2016
File: Paper17_BD.pdf
Cite: Biswajit Das, Ashish Panda, "Vector Taylor Series Expansion with Auditory Masking for Noise Robust Speech Recognition", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/1193.


Employing Median Filtering to Enhance the Complex-valued Acoustic Spectrograms in Modulation Domain for Noise-robust Speech Recognition

Paper Details

Authors: Hsin-Ju Hsieh, Berlin Chen, Jeih-weih Hung
Submitted: 14 October 2016
File: ISCSLP_2016.pdf
Cite: Hsin-Ju Hsieh, Berlin Chen, Jeih-weih Hung, "Employing Median Filtering to Enhance the Complex-valued Acoustic Spectrograms in Modulation Domain for Noise-robust Speech Recognition", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/1189.

Towards Automatic Assessment of Aphasia Speech Using Automatic Speech Recognition Techniques


Aphasia is a type of acquired language impairment caused by brain injury. This paper presents an automatic speech recognition (ASR) based approach to objective assessment of aphasia patients. A dedicated ASR system is developed to facilitate acoustical and linguistic analysis of Cantonese aphasia speech. The acoustic models and the language models are trained with domain- and style-matched speech data from unimpaired control speakers. The speech recognition performance of this system is evaluated on natural oral discourses from patients with various types of aphasia.
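
Recognition performance in such an evaluation is conventionally summarized as a token error rate derived from an edit-distance alignment; a generic sketch (not the paper's scoring pipeline; the syllable tokens in the example are invented) is:

    def edit_distance(ref, hyp):
        """Levenshtein distance between token lists (substitutions, insertions, deletions)."""
        d = list(range(len(hyp) + 1))
        for i, r in enumerate(ref, 1):
            prev_diag, d[0] = d[0], i
            for j, h in enumerate(hyp, 1):
                cur = min(d[j] + 1,                  # deletion
                          d[j - 1] + 1,              # insertion
                          prev_diag + (r != h))      # substitution or match
                prev_diag, d[j] = d[j], cur
        return d[-1]

    def error_rate(refs, hyps):
        """Token error rate: total edits over total reference tokens."""
        edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
        return edits / sum(len(r) for r in refs)

    print(error_rate([["nei5", "hou2"]], [["nei5", "hou2", "maa3"]]))  # 0.5 (one insertion)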

Paper Details

Authors: Ying Qin, Tan Lee, Anthony Pak Hin Kong, Sam Po Law
Submitted: 14 October 2016
File: conference-10.18.pdf
Cite: Ying Qin, Tan Lee, Anthony Pak Hin Kong, Sam Po Law, "Towards Automatic Assessment of Aphasia Speech Using Automatic Speech Recognition Techniques", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/1187.

Mismatched Training Data Enhancement for Automatic Recognition of Children’s Speech using DNN-HMM


The increasing profusion of commercial automatic speech recognition applications has been driven by big-data techniques that make use of high-quality labelled speech datasets. Children’s speech displays greater time and frequency domain variability than typical adult speech, lacks the depth and breadth of training material, and presents difficulties relating to capture quality. All of these factors act to reduce the achievable performance of systems that recognise children’s speech.

Paper Details

Authors: Ian McLoughlin, Wu Guo, Lirong Dai
Submitted: 14 October 2016
File: ISCSLP_poster(MengjieQian) .pdf
Cite: Ian McLoughlin, Wu Guo, Lirong Dai, "Mismatched Training Data Enhancement for Automatic Recognition of Children’s Speech using DNN-HMM", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/1186.
