
Audio and Acoustic Signal Processing

On Training Bi-directional Neural Network Language Model with Noise Contrastive Estimation


Although the uni-directional recurrent neural network language
model (RNNLM) has been very successful, it is hard to train a
bi-directional RNNLM properly due to the generative nature of
the language model. In this work, we propose to train a bi-directional
RNNLM with noise contrastive estimation (NCE), since the
properties of NCE training help the model achieve
sentence-level normalization. Experiments are conducted on
two hand-crafted tasks on the PTB data set: a rescore task and

Paper Details

Authors:
Tianxing He, Yu Zhang, Jasha Droppo, Kai Yu
Submitted On:
16 October 2016 - 11:45am

Document Files

iscslp2016_poster_v2.pdf

(93 downloads)

[1] Tianxing He, Yu Zhang, Jasha Droppo, Kai Yu, "On Training Bi-directional Neural Network Language Model with Noise Contrastive Estimation", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/1255. Accessed: Jun. 29, 2017.

The Preliminary Study of Influence on Tone Perception from Segments

Paper Details

Authors:
Chong Cao, Yanlu Xie, Ju Lin, Qian Li, Jinsong Zhang
Submitted On:
15 October 2016 - 10:42pm

Document Files

the preliminary study of influence on tone perception from segments.pdf

(94 downloads)

[1] Chong Cao, Yanlu Xie, Ju Lin, Qian Li, Jinsong Zhang, "The Preliminary Study of Influence on Tone Perception from Segments", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/1253. Accessed: Jun. 29, 2017.

A Study on perceptual training of Japanese CSL Learner to Discriminate Mandarin Lexical Tones


In the process of learning Chinese as a second language (CSL), native Japanese speakers have difficulty with tone perception. Among the four Chinese lexical tones, the tone pairs Tone 1-Tone 2 and Tone 1-Tone 4 are problematic for Japanese CSL beginners. To help them develop the ability to discriminate these tone pairs efficiently, we designed a hybrid perceptual training scheme that combines adaptive training and high variability phonetic training.

Paper Details

Authors:
Feiya Li, Yanlu Xie, Xiaomin Yu, Jinsong Zhang
Submitted On:
15 October 2016 - 12:55pm

Document Files

ISCSLP-paper177-oral.pdf

(92 downloads)

[1] Feiya Li, Yanlu Xie, Xiaomin Yu, Jinsong Zhang, "A Study on perceptual training of Japanese CSL Learner to Discriminate Mandarin Lexical Tones", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/1252. Accessed: Jun. 29, 2017.

Acoustic Correlates and Gender Effects in Production and Perception of Japanese Polite Speech


This study examines the potential contribution of prosodic features and voice quality to the perception and production of Japanese polite speech, as well as possible gender effects in politeness strategy.

Shi Shuju, Tsurutani Chiharu, Feng Xiaoli, Zhang Jinsong, Minematsu Nobuaki

Paper Details

Authors:
Chiharu Tsurutani, Xiaoli Feng, Jinsong Zhang, Nobuaki Minematsu
Submitted On:
15 October 2016 - 11:31am

Document Files

Japanese politness

(90 downloads)

[1] Chiharu Tsurutani, Xiaoli Feng, Jinsong Zhang, Nobuaki Minematsu, "Acoustic Correlates and Gender Effects in Production and Perception of Japanese Polite Speech", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/1246. Accessed: Jun. 29, 2017.

Automatic Detection of Rhythmic Patterns in Native and L2 Speech: Chinese, Japanese, and Japanese L2 Chinese


This study explores the possible contribution of speech rhythm to foreign accent. We conducted statistical analysis and implemented automatic detection of rhythmic patterns in Mandarin Chinese, Japanese, and the Chinese of Japanese second-language (L2) learners, using interval-based and amplitude-based measures.
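
The abstract does not name the specific measures used. As an illustration of what an interval-based rhythm measure can look like, the sketch below computes the normalized Pairwise Variability Index (nPVI), a widely used metric over successive interval durations. Choosing nPVI here, along with the function name and the idea of passing a plain list of durations, is an assumption for illustration and not the authors' stated method.

```python
def npvi(durations):
    """Normalized Pairwise Variability Index over successive interval durations.

    `durations` is a sequence of positive interval lengths (for example,
    vocalic interval durations in seconds). Illustrative helper only.
    """
    if len(durations) < 2:
        raise ValueError("need at least two intervals")
    total = 0.0
    for a, b in zip(durations, durations[1:]):
        if a <= 0 or b <= 0:
            raise ValueError("durations must be positive")
        # Difference of each adjacent pair, normalized by the pair's mean.
        total += abs(a - b) / ((a + b) / 2.0)
    return 100.0 * total / (len(durations) - 1)
```

Perfectly even intervals give an nPVI of 0, and larger values indicate more alternation between long and short intervals.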

rhythm.pdf

Paper Details

Authors:
Xiaoli Feng, Jingsong Zhang, Yanlu Xie
Submitted On:
15 October 2016 - 10:44am

Document Files

L2 speech rhythm

(85 downloads)

[1] Xiaoli Feng, Jingsong Zhang, Yanlu Xie, "Automatic Detection of Rhythmic Patterns in Native and L2 Speech: Chinese, Japanese, and Japanese L2 Chinese", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/1240. Accessed: Jun. 29, 2017.

Applying Connectionist Temporal Classification Objective Function to Chinese Mandarin Speech Recognition


This paper establishes CTC-based systems for the Chinese Mandarin ASR task. Output units at three different levels are explored: characters, context-independent phonemes, and context-dependent phonemes. To make training stable, we propose the Newbob-Trn strategy; in addition, a blank label prior cost is proposed to improve performance. Further, we establish the CTC-trained UniLSTM-RC model, which meets the real-time requirement of an online system while bringing a performance gain on the Chinese Mandarin ASR task.
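
For context on the blank label mentioned in this abstract: in CTC, a frame-level label path is mapped to an output sequence by merging consecutive repeated labels and then deleting blanks. The sketch below shows only that collapsing step. The blank symbol `_` and the function name are illustrative assumptions, and the code does not reproduce the paper's Newbob-Trn strategy or its blank label prior cost.

```python
def ctc_collapse(path, blank="_"):
    """Collapse a frame-level CTC path: merge repeats, then drop blanks."""
    out = []
    prev = None
    for label in path:
        # Emit a label only when it starts a new run and is not the blank.
        if label != prev and label != blank:
            out.append(label)
        prev = label
    return out
```

A blank between two identical labels keeps them as separate outputs, which is how CTC can represent repeated units.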

Paper Details

Authors:
Pengrui Wang, Jie Li, Bo Xu
Submitted On:
17 October 2016 - 11:07am

Document Files

Applying Connectionist Temporal Classification Objective Function to Chinese Mandarin Speech Recognition.pptx

(77 downloads)

[1] Pengrui Wang, Jie Li, Bo Xu, "Applying Connectionist Temporal Classification Objective Function to Chinese Mandarin Speech Recognition", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/1231. Accessed: Jun. 29, 2017.

The influence of syllable structure and prosodic strengthening on consonant production in Shanghai Chinese

Paper Details

Authors:
Bijun Ling, Jie Liang
Submitted On:
15 October 2016 - 4:47am

Document Files

id201.pptx

(83 downloads)

[1] Bijun Ling, Jie Liang, "The influence of syllable structure and prosodic strengthening on consonant production in Shanghai Chinese", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/1225. Accessed: Jun. 29, 2017.

A REGRESSION APPROACH TO BINAURAL SPEECH SEGREGATION VIA DEEP NEURAL NETWORK


This paper proposes a novel regression approach to binaural speech segregation based on a deep neural network (DNN). In contrast to the conventional ideal binary mask (IBM) method, which uses a DNN with the interaural time difference (ITD) and interaural level difference (ILD) as the auditory features, the log-power spectra (LPS) features of the target speech are directly predicted by a regression DNN model that takes the concatenated monaural LPS features and binaural features as its input.
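
To make the two binaural cues named above concrete, the sketch below estimates an ILD as an energy ratio in decibels and an ITD as the lag that maximizes the cross-correlation between the two ear signals. It works on whole signals with the standard library only; real systems typically compute these cues per frame and per frequency channel. The function names, the `eps` guard, and the sign convention are assumptions for illustration, not the authors' feature extraction.

```python
import math


def ild_db(left, right, eps=1e-12):
    """Interaural level difference in dB (positive when left is louder)."""
    energy_left = sum(x * x for x in left)
    energy_right = sum(x * x for x in right)
    # eps guards against log of zero for silent input.
    return 10.0 * math.log10((energy_left + eps) / (energy_right + eps))


def itd_samples(left, right, max_lag):
    """Lag in samples maximizing cross-correlation.

    Positive when the right signal is delayed relative to the left.
    """
    n = min(len(left), len(right))
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += left[i] * right[j]
        if score > best_score:
            best_score, best_lag = score, lag
    return best_lag
```

Dividing the returned lag by the sampling rate converts it to seconds.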

Paper Details

Authors:
Nana Fan, Jun Du, Lirong Dai
Submitted On:
14 October 2016 - 11:07pm

Document Files

oral-presentation3.pptx

(78 downloads)

oral-presentation3.pptx

(76 downloads)

[1] Nana Fan, Jun Du, Lirong Dai, "A REGRESSION APPROACH TO BINAURAL SPEECH SEGREGATION VIA DEEP NEURAL NETWORK", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/1207. Accessed: Jun. 29, 2017.

The Design and Implementation of HMM-based Dai Speech Synthesis


More than 1.2 million Dai people in Yunnan province use the Dai language, so research on Dai speech synthesis is of great significance for advancing the informatization of Dai. This paper studies the implementation of Dai speech synthesis using the HMM speech synthesis framework and the STRAIGHT synthesizer.
It describes the collection and selection of the Dai text corpus, the recording of the speech corpus, text normalization, segmentation, Romanization, and the implementation of acoustic model training.

Paper Details

Authors:
Wang Zhan, Yang Jian, Yang xin
Submitted On:
14 October 2016 - 11:30am

Document Files

会议海报.pdf

(0)

[1] Wang Zhan, Yang Jian, Yang xin, "The Design and Implementation of HMM-based Dai Speech Synthesis", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/1204. Accessed: Jun. 29, 2017.

Contributions of the Piriform Fossa of Female Speakers to Vowel Spectra


The bilateral cavities of the piriform fossa are side branches of the vocal tract and produce anti-resonance(s) in the transfer function. This effect is known for male vocal tracts, but data for female speakers have been scarce. This study investigates the contributions of the piriform fossa to vowel spectra in female vocal tracts by means of MRI-based vocal-tract modeling and an acoustic experiment with the water-filling technique. Results from three female subjects indicate that the piriform fossa generates one or two dips in the frequency region of 4-6 kHz.
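
As a rough plausibility check on the 4-6 kHz range, a closed side branch can be idealized as a uniform quarter-wavelength resonator, whose first anti-resonance lies at f = c / (4L) for branch depth L. The sketch below applies and inverts that relation. The uniform-tube idealization, the sound speed of 350 m/s, and the function names are assumptions for illustration and are not taken from the paper's MRI-based models.

```python
SPEED_OF_SOUND_M_S = 350.0  # assumed value for warm, humid air in the vocal tract


def first_dip_hz(depth_m, c=SPEED_OF_SOUND_M_S):
    """First anti-resonance of an idealized closed side branch of given depth."""
    if depth_m <= 0:
        raise ValueError("depth must be positive")
    return c / (4.0 * depth_m)


def depth_for_dip_m(freq_hz, c=SPEED_OF_SOUND_M_S):
    """Effective branch depth whose first anti-resonance falls at freq_hz."""
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    return c / (4.0 * freq_hz)


if __name__ == "__main__":
    for f in (4000.0, 6000.0):
        print(f"{f:.0f} Hz -> {depth_for_dip_m(f) * 100:.2f} cm")
```

Under these assumptions, dips between 4 and 6 kHz correspond to effective branch depths of roughly 1.5 to 2.2 cm.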

Paper Details

Authors:
Congcong Zhang, Kiyoshi Honda, Ju Zhang, Jianguo Wei
Submitted On:
15 October 2016 - 12:24am

Document Files

zcc_ISCSLP2016.pdf

(67 downloads)

[1] Congcong Zhang, Kiyoshi Honda, Ju Zhang, Jianguo Wei, "Contributions of the Piriform Fossa of Female Speakers to Vowel Spectra", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/1203. Accessed: Jun. 29, 2017.
