ISCSLP 2016

Welcome to ISCSLP 2016 - October 17-20, 2016, Tianjin, China

ISCSLP 2016 will be hosted by Tianjin University. Tianjin has a reputation throughout China for being extremely friendly and safe, and for its delicious food. We welcome you to Tianjin for ISCSLP 2016. The 10th International Symposium on Chinese Spoken Language Processing (ISCSLP 2016) will be held on October 17-20, 2016 in Tianjin. ISCSLP is a biennial conference for scientists, researchers, and practitioners to report and discuss the latest progress in all theoretical and technological aspects of spoken language processing. While ISCSLP focuses primarily on Chinese languages, work on other languages that may be applied to Chinese speech and language is also encouraged. The working language of ISCSLP is English.

Pronunciation Error Detection using DNN Articulatory Model based on Multi-lingual and Multi-task Learning

poster-v2.pdf

poster-v2.pdf (305)

Categories: Audio and Acoustic Signal Processing

7 Views

Microphone Array Speech Denoising Modeled by Tensor Filtering

This paper proposes a novel speech denoising method based on tensor filtering, in which the microphone array speech signal is represented as a tensor and processed with a tensor filtering model. The multi-microphone signal is represented as a third-order tensor indexed by channel, time, and frequency. Noise is reduced by finding a lower-rank approximation of the third-order tensor with the Tucker model. The MDL (Minimum Description Length) criterion is used to estimate the optimal tensor rank.
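The low-rank idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses a truncated higher-order SVD (HOSVD), one common way to compute a Tucker-style approximation, and it assumes a real-valued (channel, time, frequency) tensor. The function names and the hand-picked ranks are hypothetical; the MDL rank estimation mentioned in the abstract is not implemented here.

```python
import numpy as np


def unfold(tensor, mode):
    """Mode-n unfolding: bring axis `mode` to the front and flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def mode_product(tensor, matrix, mode):
    """Multiply `tensor` by `matrix` along axis `mode`."""
    moved = np.moveaxis(tensor, mode, 0)
    result = np.tensordot(matrix, moved, axes=(1, 0))
    return np.moveaxis(result, 0, mode)


def truncated_hosvd(tensor, ranks):
    """Tucker-style low-rank approximation via truncated HOSVD.

    `ranks` holds one target rank per axis; here they are chosen by hand
    (assumption), not estimated with MDL.
    """
    factors = []
    for mode, rank in enumerate(ranks):
        # Leading left singular vectors of each unfolding span the kept subspace.
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u[:, :rank])

    # Project onto the subspaces to get the core tensor.
    core = tensor
    for mode, u in enumerate(factors):
        core = mode_product(core, u.conj().T, mode)

    # Expand the core back to the original shape.
    approx = core
    for mode, u in enumerate(factors):
        approx = mode_product(approx, u, mode)
    return approx, core, factors


# Synthetic demo: a rank-(1, 1, 1) "clean" tensor plus white noise.
rng = np.random.default_rng(0)
channels, frames, bins = 4, 50, 40
clean = np.einsum(
    "i,j,k->ijk",
    rng.standard_normal(channels),
    rng.standard_normal(frames),
    rng.standard_normal(bins),
)
noisy = clean + 0.1 * rng.standard_normal(clean.shape)
denoised, _, _ = truncated_hosvd(noisy, ranks=(1, 1, 1))

print("error before:", np.linalg.norm(noisy - clean))
print("error after: ", np.linalg.norm(denoised - clean))
```

On this synthetic data the approximation discards most of the noise energy because the noise is spread across all directions while the signal occupies a small subspace. Real array recordings would need ranks matched to the signal, which is the role the abstract assigns to the MDL criterion.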

12090poster.pdf

Microphone Array Speech Denoising Modeled by Tensor Filtering (328)

Categories: Loudspeaker and Microphone Array Signal Processing

18 Views

Recognition of spoken words in L2 speech using L1 probabilistic phonotactics: Evidence from Cantonese-English bilinguals

iscslp2016_2.ppt

iscslp2016_2.ppt (370)

Categories: Spoken Language Processing

18 Views

Exploiting Language-Mismatched Phoneme Recognizers for Unsupervised Acoustic Modeling

This paper describes an investigation of acoustic modeling in the absence of transcribed training data. We propose to use language-mismatched phoneme recognizers to assist unsupervised segmentation and segment clustering of a new language. Using a language-mismatched recognizer, an input utterance is divided into many variable-length segments. Each segment is represented by a feature vector derived from the phoneme posterior probabilities.

slides.pdf

slides.pdf (885)

Categories: Acoustic Modeling for Automatic Speech Recognition (SPE-RECO)

22 Views

First Investigation of Universal Speech Attributes for Speaker Verification

This paper addresses the application of universal speech attributes to speaker verification (SV). The manner and place of articulation form the universal attribute unit inventory, and a deep neural network (DNN) is used as the acoustic model.

ISCSLP-张圣.pdf

ISCSLP-张圣.pdf (74)

Categories: Speaker Recognition and Characterization (SPE-SPKR)

8 Views

Dictionary Update for NMF-based Voice Conversion Using an Encoder-Decoder Network

In this paper, we propose a dictionary update method for Nonnegative Matrix Factorization (NMF) with high-dimensional data in a spectral conversion (SC) task. Voice conversion has been widely studied because of its potential applications, such as personalized speech synthesis and speech enhancement. Exemplar-based NMF (ENMF) has emerged as an effective and probably the simplest choice among all techniques for SC, as long as a source-target parallel speech corpus is given. ENMF-based SC systems usually need a large number of bases (exemplars) to ensure the quality of the converted speech.

2016-10-20-ISCSLP-v1.0-SigPort.pptx

2016-10-20-ISCSLP-v1.0-SigPort.pptx (675)

Categories: Speech Synthesis and Generation, including TTS (SPE-SYNT)

5 Views

Spatial Co-variation of Lip and Tongue at Strong and Weak Syllables

Speech production requires coordinated control of different articulatory organs. In natural speech, articulatory co-variation is more common than compensation, but studies supporting this view are few. In this study, the coordination of lip and tongue articulation during speech was examined using articulatory data. Native speakers of Chinese served as subjects. The speech materials consisted of short Chinese sentences containing words with the cardinal vowels at different locations, spoken with and without emphasis.

ZJ_ISCSLP2016_kh.pdf

ISCSLP2016_POSTER (703)

Categories: Audio and Acoustic Signal Processing

12 Views

The Examination of the Relationship between Perception and Production of Mandarin tone of Kazak Students

This study examines the relationship between the perception and production of Mandarin tones by Kazak minor learners from China. An eight-day perceptual training course on Mandarin tones was designed. Perception is assessed by means of an identification test. Production data are collected at both pretest and post-test and evaluated by native speakers of Mandarin Chinese. The perception results at pretest and post-test reveal that training Kazak learners to perceive Mandarin tones is effective, with

ISCSLP168.pdf

iscslp168 (694)

Categories: Audio and Acoustic Signal Processing

64 Views

RICH PROSODIC INFORMATION EXPLORATION ON SPONTANEOUS MANDARIN SPEECH

In this paper, rich prosodic information of spontaneous Mandarin speech is explored. The joint prosody labeling and modeling algorithm proposed previously for read speech is extended to spontaneous-speech prosody modeling by additionally modeling disfluent speech parts. It trains a hierarchical prosodic model and performs prosody labeling automatically from a large speech corpus. Rich prosodic information is then explored by analyzing the model parameters and labeling results.

42x36_ISCSLP2016_posters.pdf

42x36_ISCSLP2016_posters.pdf (759)

Categories: Speech Analysis (SPE-ANLS)

5 Views

Tongue Shape Variation Model for Simulating Mandarin Chinese Articulation

We studied tongue shapes extracted from X-ray films taken during Mandarin Chinese articulation. Through factor analysis, we built a tongue articulation model driven by eight parameters. This study reveals that the front of the tongue has large horizontal movement and the blade of the tongue has large vertical movement, whereas the back and the root of the tongue have small movement both horizontally and vertically. This model can be used to drive a 3D tongue model to control its articulatory behavior.

Tongue Shape Variation Model.pdf

Tongue Shape Variation Model.pdf (62)

Categories: Speech Synthesis and Generation, including TTS (SPE-SYNT)

29 Views
