ISCSLP 2016

Welcome to ISCSLP 2016 - October 17-20, 2016, Tianjin, China

ISCSLP 2016 will be hosted by Tianjin University. Tianjin has a reputation throughout China for being extremely friendly and safe, and for its delicious food. We welcome you to Tianjin to attend ISCSLP 2016. The 10th International Symposium on Chinese Spoken Language Processing (ISCSLP 2016) will be held on October 17-20, 2016 in Tianjin. ISCSLP is a biennial conference for scientists, researchers, and practitioners to report and discuss the latest progress in all theoretical and technological aspects of spoken language processing. While ISCSLP focuses primarily on Chinese languages, work on other languages that may be applied to Chinese speech and language is also encouraged. The working language of ISCSLP is English.

The Perception of the English Alveolar-velar Nasal Coda Contrast by Monolingual versus Bilingual Chinese Speakers

Relatively little research has addressed the role of L1 in the
perception of English speech contrasts by Chinese learners of
English as L3. The present study investigates the role of L1 in
the perception of the English alveolar-velar nasal coda contrast
(/n/ vs. /ŋ/) after the vowels /i ʌ æ/ by bilingual Changsha
Chinese speakers, whose L1 is Changsha Chinese and L2 is
Standard Mandarin. Changsha Chinese only permits an
alveolar nasal coda /n/, while Standard Mandarin permits both
final /n/ and /ŋ/. We examined whether or not monolingual

10.12Tianjing.pdf

L3 speech perception by Chinese speakers (392)

Categories:: Speech Perception and Psychoacoustics (SPE-SPER)


Cross-corpus Speech Emotion Recognition Using Transfer Semi-supervised Discriminant Analysis

tsda_peng.pdf

tsda_peng.pdf (339)

Categories:: Speech Perception and Psychoacoustics (SPE-SPER)


Cross-corpus Speech Emotion Recognition Using Transfer Semi-supervised Discriminant Analysis

tsda_peng.pdf

tsda_peng.pdf (307)

Categories:: Speech Perception and Psychoacoustics (SPE-SPER)


L1/L2 Difference in Phonological Sensitivity and Information Planning - Evidence from F0 Pattern

Assuming that linguistic specifications and information
planning contribute to different levels of prosodic organization
that cumulatively constitute output prosody, quantitative
analysis of respective contributions can be derived through
normalization procedures that remove levels of interactions
involved. The current study attempts to account for how L2
prosody departs from the L1 norm in the two levels mentioned
and whether an account can be offered. F0 patterns of English
word stress categories (primary, secondary and tertiary) and

Final ISCSLP16_Poster.pdf

Final ISCSLP16_Poster.pdf (75)

Categories:: Speech Production (SPE-SPRD)
Human Spoken Language Acquisition, Development and Learning (SLP-LADL)


Cluster-Based Senone Selection for the Efficient Calculation of Deep Neural Network Acoustic Models

This is an oral presentation at ISCSLP. For more information, please refer to the paper:

Jun-Hua Liu, Zhen-Hua Ling, Si Wei, Guo-Ping Hu, Li-Rong Dai, "Cluster-Based Senone Selection for the Efficient Calculation of Deep Neural Network Acoustic Models", ISCSLP, 2016.

20161001_dnn_cluster_v2.pptx

20161001_dnn_cluster_v2.pptx (789)

Categories:: Audio Processing Systems


Study on the Relation of Fundamental and Formant Frequencies for Affective Speech Synthesis

The Directions Into Velocities of Articulators (DIVA) model is a self-adaptive neural network model that controls the movements of a simulated vocal tract to produce words, syllables or phonemes. However, the DIVA model lacks emotion functions. To implement an emotion function in the DIVA model, we investigate the process of affective speech production based on the combination of fundamental frequency (F0) and formant frequencies, as well as the relations between F0 and the formants of emotional speech.

ISCSLP_POSTER_20161010.pdf

poster (319)

Categories:: Audio and Acoustic Signal Processing


Cantonese Spoken Word Retention by Speakers with and without Congenital Amusia: Implications from Phonological Similarity and Cognitive Load Effects

Success in spoken word processing relies not only on accurate word recognition but also on the veracity with which words are maintained in memory. However, research on word retention is still scarce, especially in tonal languages and phonologically impaired populations. To address these gaps, the present study administered an auditory order recall task to native Cantonese speakers with and without amusia. Stimulus-intrinsic factors (segmental similarity, suprasegmental similarity, and lexicality) and a stimulus-extrinsic factor (cognitive load) were manipulated.

Cantonese Spoken Word Retention by Speakers with and without Congenital Amusia.pdf

Cantonese Spoken Word Retention by Speakers with and without Congenital Amusia.pdf (70)

Categories:: Speech Perception and Psychoacoustics (SPE-SPER)


Speech Enhancement with Binaural Cues Derived from a Priori Codebook


In conventional codebook-driven speech enhancement, only the spectral envelopes of speech and noise are considered, and the type of noise is treated as a priori information when enhancing the noisy speech. In this paper, we propose a novel codebook-based speech enhancement method which exploits a priori information about binaural cues, including clean cues and pre-enhanced cues, stored in the trained codebook. The method has two main parts: offline training of the cues and online enhancement by means of the cues.

ISLSLP2016 陈楠.ppt

ISLSLP2016 陈楠.ppt (74)

Categories:: Source Separation and Signal Enhancement


A post-thyroidectomy voice quality study in patients suffering or not from Recurrent Laryngeal paralysis

The main subject of this study is voice quality after total thyroidectomy (which involves complete removal of the thyroid gland) or isthmolobectomy (which involves removal of the right or left half of the gland). These procedures often cause permanent or temporary degradation of voice quality. Voice quality is studied here using aerodynamic cues: from an aerodynamic point of view, oral airflow (Oaf) and maximum phonation time (TMP) were observed.

Für Tianjin 2016.pdf

Für Tianjin 2016.pdf (70)

Categories:: Audio and Acoustic Signal Processing

