Human Spoken Language Acquisition, Development and Learning (SLP-LADL)

SPEECH AUGMENTATION USING WAVENET IN SPEECH RECOGNITION


Data augmentation is crucial to improving the performance of deep neural networks: it helps the model avoid overfitting and generalize better. In automatic speech recognition, previous work has proposed augmenting data through speed perturbation or spectral transformation. Because data augmented in these ways has acoustic representations similar to those of the original data, it offers limited benefit for the generalization of the acoustic model.
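As a rough illustration of the speed perturbation mentioned in the abstract, the sketch below resamples a waveform with linear interpolation so that it plays faster or slower. It is not the authors' method or any toolkit's implementation: the helper name `speed_perturb`, the 16 kHz sampling rate, the 440 Hz test tone, and the factors 0.9, 1.0 and 1.1 are all assumptions chosen for the example.

```python
import numpy as np

def speed_perturb(waveform: np.ndarray, factor: float) -> np.ndarray:
    """Resample a 1-D waveform so it plays roughly `factor` times faster.

    Hypothetical helper for illustration only; it uses plain linear
    interpolation with no anti-aliasing filter.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    # A faster signal has fewer samples, a slower one has more.
    n_out = max(1, int(round(len(waveform) / factor)))
    # Positions in the original signal that the output samples read from.
    src_positions = np.linspace(0.0, len(waveform) - 1, num=n_out)
    return np.interp(src_positions, np.arange(len(waveform)), waveform)

# One second of a 440 Hz tone at an assumed 16 kHz sampling rate.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440.0 * t)

# Three copies of the same utterance at assumed speed factors.
augmented = {f: speed_perturb(tone, f) for f in (0.9, 1.0, 1.1)}
for f, wav in augmented.items():
    print(f, len(wav))
```

The perturbed copies differ from the original only in duration and pitch, which is consistent with the abstract's point that such data stays acoustically close to the source recordings.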

Paper Details

Authors:
Sangki Kim, Yeha Lee
Submitted On:
10 May 2019 - 9:55am

Document Files

#4788_SPEECH AUGMENTATION USING WAVENET IN SPEECH RECOGNITION.pdf


[1] Sangki Kim, Yeha Lee, "SPEECH AUGMENTATION USING WAVENET IN SPEECH RECOGNITION", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4337. Accessed: May. 23, 2019.

Models of visually grounded speech signal pay attention to nouns: a bilingual experiment on English and Japanese


We investigate the behaviour of attention in neural models of visually grounded speech trained on two languages: English and Japanese. Experimental results show that attention focuses on nouns and that this behaviour holds for two typologically very different languages. We also draw parallels between artificial neural attention and human attention and show that neural attention focuses on word endings, as has been theorised for human attention. Finally, we investigate how two visually grounded monolingual models can be used to perform cross-lingual speech-to-speech retrieval.

Paper Details

Authors:
Jean-Pierre Chevrot, Laurent Besacier
Submitted On:
8 May 2019 - 6:18am

Document Files

POSTER_ICASSP.pdf

ARTICLE_ICASSP2019.pdf


[1] Jean-Pierre Chevrot, Laurent Besacier, "Models of visually grounded speech signal pay attention to nouns: a bilingual experiment on English and Japanese", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4066. Accessed: May. 23, 2019.

Tongue Performance in Articulating Mandarin Apical Syllables by Prelingual Deaf Adults Using Ultrasonic Technology: Two Case Studies


In the present study, ultrasonic data from two prelingually deaf participants were collected to observe tongue movements during the production of all Mandarin Chinese apical syllables, except ri, under the four citation tones. The analysis showed that, besides their individual characteristics, the two participants share similar problems in producing these apical syllables, such as producing alveolar syllables as post-alveolar syllables, realizing affricates as fricatives, and being unable to pronounce some types of apical syllables that they can perceive correctly.

Paper Details

Authors:
Quan Zhou, Yu Chen, Yanting Chen, Hao Zhang, Jianguo Wei, Jianwu Dang
Submitted On:
16 October 2016 - 11:58pm

Document Files

Tongue Performance in Articulating Mandarin Apical Syllables by Prelingual Deaf Adults Using Ultrasonic Technology_poster.pdf


[1] Quan Zhou, Yu Chen, Yanting Chen, Hao Zhang, Jianguo Wei, Jianwu Dang, "Tongue Performance in Articulating Mandarin Apical Syllables by Prelingual Deaf Adults Using Ultrasonic Technology: Two Case Studies", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/1258. Accessed: May. 23, 2019.

Investigation of the Effects of Automatic Scoring Technology on Human Raters' Performances in L2 Speech Proficiency Assessment


This study investigates how automatic scoring based on speech technology can affect human raters' judgement of students' oral language proficiency in L2 speaking tests. Automatic scoring based on ASR is widely used in non-critical speaking tests or practice, and relatively high correlations between machine scores and human scores have been reported. In high-stakes speaking tests, however, many teachers remain skeptical about the fairness of machine-assigned scores, even with the most advanced scoring methods.

Paper Details

Authors:
Dean Luo, Wentao Gu, Ruxin Luo, Lixin Wang
Submitted On:
16 October 2016 - 11:17pm

Document Files

Paper.No_.25.pptx


[1] Dean Luo, Wentao Gu, Ruxin Luo, Lixin Wang, "Investigation of the Effects of Automatic Scoring Technology on Human Raters' Performances in L2 Speech Proficiency Assessment", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/1210. Accessed: May. 23, 2019.

Investigation of the Effects of Automatic Scoring Technology on Human Raters' Performances in L2 Speech Proficiency Assessment


This study investigates how automatic scoring based on speech technology can affect human raters' judgement of students' oral language proficiency in L2 speaking tests. Automatic scoring based on ASR is widely used in non-critical speaking tests or practice, and relatively high correlations between machine scores and human scores have been reported. In high-stakes speaking tests, however, many teachers remain skeptical about the fairness of machine-assigned scores, even with the most advanced scoring methods.

Paper Details

Authors:
Dean Luo, Wentao Gu, Ruxin Luo, Lixin Wang
Submitted On:
14 October 2016 - 12:37pm

Document Files

Paper.No_.25.pptx


[1] Dean Luo, Wentao Gu, Ruxin Luo, Lixin Wang, "Investigation of the Effects of Automatic Scoring Technology on Human Raters' Performances in L2 Speech Proficiency Assessment", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/1209. Accessed: May. 23, 2019.

Rich Punctuations Prediction Using Large-scale Deep Learning


Punctuation plays an important role in language processing. However, automatic speech recognition systems output only plain word sequences, so it is of interest to predict punctuation on these sequences. Previous work has focused on using lexical features or prosodic cues captured from small corpora to predict simple punctuations. Compared with simple punctuations, rich punctuations provide more meaningful

Paper Details

Authors:
Xueyang Wu, Su Zhu, Yue Wu, and Kai Yu
Submitted On:
14 October 2016 - 2:50am

Document Files

This is a poster.


[1] Xueyang Wu, Su Zhu, Yue Wu, and Kai Yu, "Rich Punctuations Prediction Using Large-scale Deep Learning", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/1181. Accessed: May. 23, 2019.

L1/L2 Difference in Phonological Sensitivity and Information Planning - Evidence from F0 Pattern


Assuming that linguistic specifications and information planning contribute to different levels of prosodic organization that cumulatively constitute output prosody, quantitative analysis of their respective contributions can be derived through normalization procedures that remove the levels of interaction involved. The current study attempts to account for how L2 prosody departs from the L1 norm at the two levels mentioned and whether an account can be offered. F0 patterns of English word stress categories (primary, secondary and tertiary) and

Paper Details

Authors:
Chao-yu Su, Chiu-yu Tseng
Submitted On:
12 October 2016 - 2:03am

Document Files

Final ISCSLP16_Poster.pdf


[1] Chao-yu Su, Chiu-yu Tseng, "L1/L2 Difference in Phonological Sensitivity and Information Planning - Evidence from F0 Pattern", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/1157. Accessed: May. 23, 2019.