ISCSLP 2016

Welcome to ISCSLP 2016 - October 17-20, 2016, Tianjin, China

ISCSLP 2016 will be hosted by Tianjin University. Tianjin has a reputation throughout China for being extremely friendly and safe, and for its delicious food. We welcome you to Tianjin for ISCSLP 2016. The 10th International Symposium on Chinese Spoken Language Processing (ISCSLP 2016) will be held on October 17-20, 2016 in Tianjin. ISCSLP is a biennial conference for scientists, researchers, and practitioners to report and discuss the latest progress in all theoretical and technological aspects of spoken language processing. While ISCSLP focuses primarily on Chinese languages, work on other languages that may be applied to Chinese speech and language is also encouraged. The working language of ISCSLP is English.

 

This paper proposes a novel speech denoising method based on tensor filtering, in which the microphone array speech signal is represented as tensor data and processed by a tensor filtering model. The multi-microphone signal is represented as a third-order tensor whose modes are channel, time, and frequency. Noise is reduced by finding a lower-rank approximation of the third-order tensor with the Tucker model. The MDL (Minimum Description Length) criterion is used to estimate the optimal tensor rank.
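
The abstract does not give the exact MDL formulation used. As a rough illustration only, the sketch below applies one common eigenvalue-based form of the criterion (the Wax-Kailath form used for model order estimation) to a list of eigenvalues. Using it per tensor mode, on the eigenvalues of a mode unfolding's covariance, is an assumption here, and the names `mdl_rank` and `num_snapshots` are hypothetical.

```python
import math


def mdl_rank(eigenvalues, num_snapshots):
    """Return the order k minimizing a Wax-Kailath style MDL score.

    eigenvalues: positive eigenvalues of a sample covariance matrix.
    num_snapshots: number of observations used to estimate that covariance.
    Assumption: this is one common MDL form, not necessarily the paper's.
    """
    lam = sorted(eigenvalues, reverse=True)
    p = len(lam)
    best_k, best_score = 0, float("inf")
    for k in range(p):
        tail = lam[k:]  # the p - k smallest eigenvalues (assumed noise part)
        m = len(tail)
        arith_mean = sum(tail) / m
        log_geo_mean = sum(math.log(v) for v in tail) / m
        # Fit term: close to zero when the tail eigenvalues are nearly equal.
        fit = -num_snapshots * m * (log_geo_mean - math.log(arith_mean))
        # Penalty term: grows with the number of free parameters.
        penalty = 0.5 * k * (2 * p - k) * math.log(num_snapshots)
        score = fit + penalty
        if score < best_score:
            best_k, best_score = k, score
    return best_k


# Two dominant eigenvalues above a nearly flat noise floor.
rank = mdl_rank([10.0, 8.0, 1.01, 1.0, 0.99, 1.0], num_snapshots=100)
print(rank)
```

In a Tucker setting, a rank estimated this way for each mode would determine how many leading components to keep along that mode in the low-rank approximation.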

This paper describes an investigation of acoustic modeling in the absence of transcribed training data. We propose to use language-mismatched phoneme recognizers to assist unsupervised segmentation and segment clustering of a new language. Using a language-mismatched recognizer, an input utterance is divided into many variable-length segments. Each segment is represented by a feature vector that is derived from the phoneme posterior probabilities.
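
The abstract does not say how the segment feature vector is derived from the posteriors. One simple possibility, shown here purely as an assumption, is to average the frame-level phoneme posteriors inside each segment. The function name `segment_features` and the input layout are hypothetical.

```python
def segment_features(posteriors, boundaries):
    """Average frame-level phoneme posteriors within each segment.

    posteriors: list of frames, each a list of per-phoneme probabilities.
    boundaries: list of (start, end) frame indices, with end exclusive.
    Assumption: plain averaging; the paper may derive features differently.
    """
    features = []
    for start, end in boundaries:
        frames = posteriors[start:end]
        if not frames:
            raise ValueError(f"empty segment: ({start}, {end})")
        dim = len(frames[0])
        # One fixed-length vector per variable-length segment.
        features.append(
            [sum(frame[d] for frame in frames) / len(frames) for d in range(dim)]
        )
    return features


# Three frames over a two-phoneme inventory, split into two segments.
example_posteriors = [[0.8, 0.2], [0.6, 0.4], [0.1, 0.9]]
example_features = segment_features(example_posteriors, [(0, 2), (2, 3)])
```

Fixed-length vectors like these can then be passed to any standard clustering algorithm to group segments into candidate sub-word units.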

This paper addresses the application of universal speech attributes to speaker verification (SV). The manner and place of articulation form the universal attribute unit inventory, and a deep neural network (DNN) is used as the acoustic model.

In this paper, we propose a dictionary update method for Nonnegative Matrix Factorization (NMF) with high-dimensional data in a spectral conversion (SC) task. Voice conversion has been widely studied due to its potential applications such as personalized speech synthesis and speech enhancement. Exemplar-based NMF (ENMF) has emerged as an effective and probably the simplest choice among all techniques for SC, as long as a source-target parallel speech corpus is given. ENMF-based SC systems usually need a large number of bases (exemplars) to ensure the quality of the converted speech.
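
For context, the basic exemplar-based conversion step can be sketched as follows. This is a minimal illustration of standard ENMF conversion for a single frame, not the dictionary update method the paper proposes: activations are estimated against a fixed source dictionary with the usual multiplicative update for the KL divergence, then applied to the paired target dictionary. The function names, the iteration count, and the tiny dictionaries are all hypothetical.

```python
def estimate_activations(x, A, n_iter=200, eps=1e-12):
    """Find nonnegative h with x ~ A h, keeping the dictionary A fixed.

    x: source spectrum (list of nonnegative values, one per frequency bin).
    A: source dictionary as a list of rows, A[bin][atom].
    Uses the standard multiplicative update for the KL divergence.
    """
    n_bins, n_atoms = len(A), len(A[0])
    h = [1.0] * n_atoms
    col_sums = [sum(A[i][j] for i in range(n_bins)) for j in range(n_atoms)]
    for _ in range(n_iter):
        approx = [sum(A[i][j] * h[j] for j in range(n_atoms)) for i in range(n_bins)]
        ratio = [x[i] / (approx[i] + eps) for i in range(n_bins)]
        h = [
            h[j] * sum(A[i][j] * ratio[i] for i in range(n_bins)) / (col_sums[j] + eps)
            for j in range(n_atoms)
        ]
    return h


def convert_frame(x, A_src, B_tgt):
    """Convert one source frame by reusing its activations on the target dictionary."""
    h = estimate_activations(x, A_src)
    return [sum(B_tgt[i][j] * h[j] for j in range(len(h))) for i in range(len(B_tgt))]


# Toy paired dictionaries: column j of A_src is aligned with column j of B_tgt.
A_src = [[1.0, 0.0], [0.0, 1.0]]
B_tgt = [[1.0, 2.0], [3.0, 4.0]]
converted = convert_frame([2.0, 3.0], A_src, B_tgt)
```

Because every exemplar is a dictionary column, the cost of each update grows with the number of bases, which is why the size of the dictionary matters in practice.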

Speech production requires coordinated control of different articulatory organs. In natural speech, articulatory co-variation is more common than compensation, but few studies support this view. In this study, the coordination of lip and tongue articulation was examined during speech using articulatory data. Native speakers of Chinese served as subjects. Speech materials consisted of short Chinese sentences that include words with the cardinal vowels at different locations in the sentences, with and without emphasis.

This study examines the relationship between the perception and production of Mandarin tones by Kazak minor learners from China. An eight-day perceptual training course on Mandarin tones was designed. Perception is assessed by means of an identification test. Production data are collected at both pretest and post-test, and evaluated by native speakers of Mandarin Chinese. The perception results at pretest and post-test reveal that training Kazak learners to perceive Mandarin tones is effective, with

In this paper, rich prosodic information of spontaneous Mandarin speech is explored. The joint prosody labeling and modeling algorithm proposed previously for read speech is extended to spontaneous-speech prosody modeling by additionally modeling the disfluent parts of speech. It trains a hierarchical prosodic model and performs prosody labeling automatically from a large speech corpus. Rich prosodic information is then explored by analyzing the model parameters and labeling results.

We studied tongue shapes extracted from X-ray films taken during Mandarin Chinese articulation. Through factor analysis, we built a tongue articulation model driven by eight parameters. This study reveals that the front of the tongue has large horizontal movement; the blade of the tongue has large vertical movement; whereas the back, as well as the root, of the tongue has small movement both horizontally and vertically. This model can be used to drive a 3D tongue model to control its articulatory behavior.
