ISCSLP 2016

Welcome to ISCSLP 2016 - October 17-20, 2016, Tianjin, China

ISCSLP 2016 will be hosted by Tianjin University. Tianjin has a reputation throughout China for being extremely friendly and safe, and for its delicious food. We welcome you to Tianjin to attend ISCSLP 2016. The 10th International Symposium on Chinese Spoken Language Processing (ISCSLP 2016) will be held on October 17-20, 2016 in Tianjin. ISCSLP is a biennial conference for scientists, researchers, and practitioners to report and discuss the latest progress in all theoretical and technological aspects of spoken language processing. While ISCSLP focuses primarily on Chinese languages, work on other languages that may be applied to Chinese speech and language is also encouraged. The working language of ISCSLP is English.

Investigation of the Effects of Automatic Scoring Technology on Human Raters' Performances in L2 Speech Proficiency Assessment

This study investigates how automatic scoring based on speech technology can affect human raters' judgement of students' oral language proficiency in L2 speaking tests. Automatic scoring based on ASR is widely used in non-critical speaking tests and practice, and relatively high correlations between machine scores and human scores have been reported. In high-stakes speaking tests, however, many teachers remain skeptical about the fairness of automatic scores given by machines, even with the most advanced scoring methods.

Paper.No_.25.pptx

Categories: Human Spoken Language Acquisition, Development and Learning (SLP-LADL)

Rapid Speaker Adaptation Based on D-code Extracted from BLSTM-RNN in LVCSR

Recently, several fast speaker adaptation methods based on so-called discriminative speaker codes (SC) have been proposed for hybrid DNN-HMM models and applied to unsupervised speaker adaptation in speech recognition. It has been demonstrated that SC-based methods are quite effective in adapting DNNs even when only a very small amount of adaptation data is available. However, these methods must estimate the speaker code for each new speaker through an updating process and obtain the final results through two-pass decoding.

Rapid Speaker Adaptation Based on D-code Extracted from BLSTM-RNN in LVCSR.pdf

Categories: Speech Adaptation/Normalization (SPE-ADAP)

A REGRESSION APPROACH TO BINAURAL SPEECH SEGREGATION VIA DEEP NEURAL NETWORK

This paper proposes a novel regression approach to binaural speech segregation based on a deep neural network (DNN). In contrast to the conventional ideal binary mask (IBM) method, which uses a DNN with the interaural time difference (ITD) and interaural level difference (ILD) as auditory features, the log-power spectra (LPS) features of the target speech are predicted directly by a regression DNN model whose input concatenates the monaural LPS features and the binaural features.
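The input construction described in this abstract can be illustrated with a minimal sketch. It is not the authors' feature pipeline: the frame length, hop size, Hann window, the cross-correlation ITD estimate, and the function names `frame_signal` and `binaural_input_features` are all assumptions made for illustration. It only shows how monaural LPS features and binaural cues (ILD per frequency bin, one ITD value per frame) might be concatenated into one input vector per frame.

```python
import numpy as np

def frame_signal(x, frame_len=512, hop=256):
    """Split a 1-D signal into overlapping Hann-windowed frames."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx] * np.hanning(frame_len)

def binaural_input_features(left, right, frame_len=512, hop=256, eps=1e-10):
    """Return per-frame [monaural LPS | ILD | ITD] vectors (hypothetical layout)."""
    L = np.fft.rfft(frame_signal(left, frame_len, hop), axis=1)
    R = np.fft.rfft(frame_signal(right, frame_len, hop), axis=1)
    pow_l = np.abs(L) ** 2 + eps
    pow_r = np.abs(R) ** 2 + eps
    lps = np.log(pow_l)                      # monaural log-power spectra (left ear)
    ild = 10.0 * np.log10(pow_l / pow_r)     # level difference per bin, in dB
    # ITD per frame: lag (in samples) at the peak of the circular cross-correlation.
    # A positive lag means the left channel lags the right channel.
    xcorr = np.fft.irfft(L * np.conj(R), n=frame_len, axis=1)
    lag = np.argmax(xcorr, axis=1)
    lag = np.where(lag > frame_len // 2, lag - frame_len, lag)
    return np.concatenate([lps, ild, lag[:, None].astype(float)], axis=1)
```

With the default settings each frame yields 257 LPS values, 257 ILD values, and one ITD value. A regression network would take such vectors as input and the LPS of the clean target speech as output; that training step is not shown here.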

oral-presentation3.pptx

Categories: Audio and Acoustic Signal Processing

Digit-dependent Local I-Vector for Text-Prompted Speaker Verification with

The widely adopted i-vector performs well in text-independent speaker verification with long speech duration.

ISCSLP2016_PeixinChen.pdf

Categories: Speaker Recognition and Characterization (SPE-SPKR)

Exploring Tonal Information for Lhasa Dialect Acoustic Modeling

Detailed analysis of tonal features of the Tibetan Lhasa dialect is an important task for Tibetan automatic speech recognition (ASR) applications. However, it is difficult to utilize tonal information because it remains controversial how many tonal patterns the Lhasa dialect has. Therefore, few studies have focused on modeling the tonal information of the Lhasa dialect for speech recognition purposes. For this reason, we investigated the influence of tonal information on the performance of Lhasa Tibetan speech recognition.

Poster_Exploring Tonal Information for Lhasa Dialect Acoustic Modeling.pdf

Categories: Acoustic Modeling for Automatic Speech Recognition (SPE-RECO)

The Design and Implementation of HMM-based Dai Speech Synthesis

To date, more than 1.2 million Dai people in Yunnan province use the Dai language, so research on Dai speech synthesis is of great significance for advancing the informatization of Dai. This paper studies the implementation of Dai speech synthesis using the HMM speech synthesis framework and the STRAIGHT synthesizer. It describes the collection and selection of the Dai text corpus, recording of the speech corpus, text normalization, segmentation, Romanization, and acoustic model training.

会议海报.pdf

Categories: Audio and Acoustic Signal Processing

Contributions of the Piriform Fossa of Female Speakers to Vowel Spectra

The bilateral cavities of the piriform fossa are side branches of the vocal tract and produce anti-resonance(s) in the transfer function. This effect is known for male vocal tracts, but data for female vocal tracts have been scarce. This study investigates contributions of the piriform fossa to vowel spectra in female vocal tracts by means of MRI-based vocal-tract modeling and acoustic experiments with the water-filling technique. Results from three female subjects indicate that the piriform fossa generates one or two dips in the frequency region of 4-6 kHz.
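A rough sense of why a short side branch produces dips in this frequency range comes from the quarter-wavelength relation f = c / (4L) for a uniform tube closed at its far end. The sketch below is a textbook idealization, not the MRI-based model used in the study. The sound speed of 350 m/s, the branch lengths, and the function name `side_branch_antiresonance_hz` are illustrative assumptions, not measurements from the paper.

```python
def side_branch_antiresonance_hz(branch_length_m, sound_speed_m_s=350.0):
    """Lowest anti-resonance of a uniform side branch closed at its far end.

    Uses the quarter-wavelength relation f = c / (4 * L). This is an
    idealization; real cavities are neither uniform nor rigid-walled.
    """
    if branch_length_m <= 0:
        raise ValueError("branch length must be positive")
    return sound_speed_m_s / (4.0 * branch_length_m)

# Hypothetical branch lengths in centimetres (not data from the study)
for length_cm in (1.5, 1.75, 2.0):
    f = side_branch_antiresonance_hz(length_cm / 100.0)
    print(f"{length_cm:.2f} cm -> {f:.0f} Hz")
```

Under these assumptions, branch lengths of roughly 1.5 to 2 cm place the lowest dip between about 4.4 and 5.8 kHz, which is consistent in order of magnitude with the 4-6 kHz region reported above.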

zcc_ISCSLP2016.pdf

Categories: Audio and Acoustic Signal Processing; Speech Production (SPE-SPRD)

Individual difference and acoustic effect of female laryngeal cavities

This study examines the acoustic effect of the laryngeal cavity of female speakers on the higher-frequency region of vowel spectra. To do so, MRI data of the vowels /a/ and /i/ obtained from three female speakers were analyzed, with data from a male speaker as a reference. 3D vocal-tract shapes were extracted from the MRI data and printed as solid mechanical models. Transfer functions of the models' vocal tracts were estimated with a transmission line model. Individual variations of the laryngeal cavity were described by the area functions of the cavity.
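As a minimal sketch of how a transfer function can be computed from an area function, the code below chains lossless uniform tube sections using 2x2 transmission (ABCD) matrices. This is a simplified stand-in and not the transmission line model used in the study: it ignores wall losses, viscous and thermal losses, and lip radiation (the lips are treated as an ideal open end). The function name `tube_gain`, the CGS constants, and the example area function are assumptions.

```python
import numpy as np

def tube_gain(areas_cm2, section_len_cm, freqs_hz, c=35000.0, rho=1.14e-3):
    """|U_lips / U_glottis| of a lossless concatenated-tube model.

    areas_cm2 runs from glottis to lips; the lips are an ideal open end (P = 0).
    c is the sound speed in cm/s and rho the air density in g/cm^3 (assumed values).
    """
    k = 2.0 * np.pi * np.asarray(freqs_hz, dtype=float) / c   # wavenumber per frequency
    n = k.size
    # Overall chain matrix [[A, B], [C, D]] per frequency, starting from the identity
    A, B = np.ones(n, complex), np.zeros(n, complex)
    C, D = np.zeros(n, complex), np.ones(n, complex)
    for area in areas_cm2:
        z = rho * c / area                                    # characteristic impedance
        a = np.cos(k * section_len_cm)
        b = 1j * z * np.sin(k * section_len_cm)
        c_ = 1j * np.sin(k * section_len_cm) / z
        d = np.cos(k * section_len_cm)
        # Right-multiply the running matrix by this section's matrix
        A, B, C, D = A * a + B * c_, A * b + B * d, C * a + D * c_, C * b + D * d
    # With P_lips = 0, U_glottis = D * U_lips, so the gain is 1 / |D|
    return 1.0 / np.maximum(np.abs(D), 1e-12)

# Sanity check on a hypothetical uniform tube: 10 sections of 1.75 cm, 3 cm^2 each
freqs = np.arange(50.0, 1000.0, 1.0)
gain = tube_gain([3.0] * 10, 1.75, freqs)
print("first peak (Hz):", freqs[np.argmax(gain)])
```

For a uniform tube of total length 17.5 cm, the first peak falls at c / (4L) = 500 Hz, which gives a quick check of the matrix chaining. Replacing the uniform list with a measured area function, including a narrow laryngeal section near the glottis, would show how that cavity shifts the peaks.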