Bidirectional long short-term memory (BLSTM) recurrent neural networks (RNNs) have achieved state-of-the-art performance in many sequence processing problems owing to their ability to capture contextual information. However, for languages with limited training data, it remains difficult to obtain a high-quality BLSTM model for emphasis detection, which aims to recognize emphasized segments in natural speech.

Aphasia is an acquired communication disorder that results from brain damage and impairs an individual’s ability to use, produce, and comprehend language. Loss of communication skills can be stressful and may lead to depression, yet most diagnostic tools for stress and depression are designed for adults without aphasia. This project is a research effort to predict stress and depression from the acoustic profiles of adults with aphasia using linear support-vector regression. The labels were obtained through caregiver surveys (SADQ-10) or through surveys not designed for adults with aphasia (PSS).

Automatic syllable stress detection is useful for automatically assessing and diagnosing the pronunciation quality of second language (L2) learners. Typically, syllable stress depends on three prominence measures (intensity level, duration, and pitch) around the sound unit with the highest sonority in the respective syllable. Stress detection is often formulated as a binary classification task using cues from the feature contours that represent the prominence measures.
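To make the binary formulation concrete, here is a minimal sketch that labels each syllable of one word as stressed (1) or unstressed (0). It is not the method of the work summarized above: a simple threshold on z-normalized features stands in for a trained classifier, and the function names, the per-syllable feature tuple, and the sample values are all hypothetical.

```python
from statistics import mean, pstdev


def zscores(values):
    """Z-normalize one feature across the syllables of a word."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def label_stress(syllables):
    """Return one binary label per syllable.

    syllables: list of (intensity_db, duration_s, pitch_hz) tuples,
    one tuple per syllable (a hypothetical feature layout).
    """
    columns = list(zip(*syllables))            # one column per prominence measure
    normalized = [zscores(col) for col in columns]
    scores = [sum(feats) for feats in zip(*normalized)]
    # Stand-in decision rule: above-average combined prominence means "stressed".
    return [1 if score > 0 else 0 for score in scores]


# Made-up values for a three-syllable word; the second syllable is
# louder, longer, and higher-pitched than the other two.
word = [(62.0, 0.12, 110.0), (70.0, 0.25, 145.0), (60.0, 0.10, 105.0)]
labels = label_stress(word)
print(labels)
```

A learned classifier would replace the fixed threshold, but the input and output shapes stay the same: per-syllable prominence cues in, one binary decision per syllable out.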

Nonnegative matrix factorization (NMF) has recently been applied to temporal decomposition (TD) of speech spectral envelopes represented by line spectral frequencies (LSFs). A couple of inherent TD constraints, which are otherwise handled as ad hoc exceptions, have also been incorporated using NMF, including LSF ordering and monotonic event functions. Here, these constraints are analyzed and a third inherent constraint is incorporated into an NMF analysis.
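For orientation, the following is a minimal sketch of plain NMF with the standard multiplicative updates for squared error, applied to a toy matrix whose columns play the role of LSF frames. It is an assumption-laden illustration, not the constrained analysis described above: it enforces only nonnegativity, with no LSF-ordering or monotonicity constraint, and the helper names and toy values are hypothetical.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factor a nonnegative matrix V into nonnegative W (rows x rank) and H (rank x cols)."""
    rng = random.Random(seed)
    rows, cols = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(rows)]
    h = [[rng.random() + 0.1 for _ in range(cols)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(cols)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(rows)]
    return w, h


# Toy data: two made-up "target" LSF vectors (columns of targets) blended by
# two overlapping interpolation functions across six frames.
targets = [[0.3, 0.5], [0.9, 1.1], [1.6, 1.9], [2.4, 2.7]]
events = [[1.0, 0.8, 0.6, 0.4, 0.2, 0.0],
          [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]]
V = matmul(targets, events)

W, H = nmf(V, rank=2)
print(frobenius_error(V, W, H))
```

In this reading, the columns of W act as spectral targets and the rows of H as event functions over time. The factors recovered by unconstrained NMF are not unique, which is one reason additional constraints of the kind discussed above matter.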

In this paper, the rich prosodic information of spontaneous Mandarin speech is explored. The joint prosody labeling and modeling algorithm previously proposed for read speech is extended to spontaneous speech by additionally modeling disfluent speech segments. The algorithm automatically trains a hierarchical prosodic model and performs prosody labeling on a large speech corpus. Rich prosodic information is then explored by analyzing the model parameters and labeling results.
