

Speech Emotion Recognition Using Deep Neural Network Considering Verbal and Nonverbal Speech Sounds

Abstract: 

Speech emotion recognition is becoming increasingly important for many applications. In real-life communication, nonverbal sounds within an utterance also play an important role in how people recognize emotion. To date, however, few emotion recognition systems have considered nonverbal sounds such as laughter, cries, or other emotional interjections, which occur naturally in daily conversation. In this work, both verbal and nonverbal sounds within an utterance were therefore considered for emotion recognition in real-life conversations. First, an SVM-based verbal/nonverbal sound detector was developed. A prosodic phrase (PPh) auto-tagger was then employed to extract the verbal and nonverbal segments. For each segment, emotion and sound features were extracted with convolutional neural networks (CNNs) and concatenated to form a CNN-based generic feature vector. Finally, the sequence of CNN-based feature vectors for an entire dialog turn was fed into an attentive LSTM-based sequence-to-sequence model to output an emotion sequence as the recognition result. Experimental results on recognizing seven emotional states in the NNIME corpus (the NTHU-NTUA Chinese Interactive Multimodal Emotion Corpus) showed that the proposed method achieved a detection accuracy of 52.00%, outperforming traditional methods.
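To make the segment-level pipeline concrete, the sketch below shows one possible PyTorch rendering of the two per-segment CNN encoders (emotion and sound features), their concatenation into a generic feature vector, and an attention-based LSTM over the segment sequence of a dialog turn. This is a minimal illustration under assumptions, not the authors' implementation: the log-Mel input shape, all layer sizes, the module names, and the use of self-attention on top of a bidirectional LSTM encoder (in place of the paper's exact attentive sequence-to-sequence architecture) are illustrative choices.

# Minimal sketch, assuming log-Mel spectrogram inputs per detected segment.
# All hyperparameters and shapes are illustrative, not taken from the paper.
import torch
import torch.nn as nn

class SegmentCNN(nn.Module):
    """Per-segment CNN encoder (instantiated once for emotion and once for sound features)."""
    def __init__(self, out_dim=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),              # global pooling -> (B, 32, 1, 1)
        )
        self.fc = nn.Linear(32, out_dim)

    def forward(self, x):                         # x: (B, 1, n_mels, frames)
        return self.fc(self.conv(x).flatten(1))   # -> (B, out_dim)

class AttentiveSequenceModel(nn.Module):
    """Attention-augmented LSTM mapping a sequence of segment vectors to per-segment emotion logits."""
    def __init__(self, feat_dim=256, hidden=128, n_classes=7):
        super().__init__()
        self.encoder = nn.LSTM(feat_dim, hidden, batch_first=True, bidirectional=True)
        self.attn = nn.MultiheadAttention(2 * hidden, num_heads=4, batch_first=True)
        self.out = nn.Linear(2 * hidden, n_classes)

    def forward(self, seq):                       # seq: (B, T, feat_dim)
        enc, _ = self.encoder(seq)                # (B, T, 2*hidden)
        ctx, _ = self.attn(enc, enc, enc)         # self-attention over the segments of a turn
        return self.out(ctx)                      # (B, T, n_classes)

# Toy forward pass: one dialog turn with 5 verbal/nonverbal segments.
emotion_cnn, sound_cnn = SegmentCNN(), SegmentCNN()
segments = torch.randn(5, 1, 40, 100)             # 5 segments, 40 Mel bands, 100 frames each
feats = torch.cat([emotion_cnn(segments), sound_cnn(segments)], dim=-1)   # (5, 256)
logits = AttentiveSequenceModel()(feats.unsqueeze(0))                     # (1, 5, 7)

In this toy forward pass, a turn with five detected segments yields a (1, 5, 7) tensor of logits over the seven emotional states, one prediction per verbal or nonverbal segment; in practice the CNN encoders and the sequence model would be trained jointly on labeled segments produced by the SVM detector and PPh auto-tagger.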


Paper Details

Authors: Kun-Yi Huang, Chung-Hsien Wu, Qian-Bei Hong, Ming-Hsiang Su and Yi-Hsuan Chen
Submitted On: 9 May 2019 - 8:26am
Short Link:
Type: Presentation Slides
Event:
Presenter's Name: Chung-Hsien Wu
Paper Code: ICASSP19005
Document Year: 2019

Document Files

ICASSP2019-0509.pdf



[1] Kun-Yi Huang, Chung-Hsien Wu, Qian-Bei Hong, Ming-Hsiang Su and Yi-Hsuan Chen, "Speech Emotion Recognition Using Deep Neural Network Considering Verbal and Nonverbal Speech Sounds", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4197. Accessed: Feb. 27, 2020.
@article{4197-19,
  author    = {Kun-Yi Huang and Chung-Hsien Wu and Qian-Bei Hong and Ming-Hsiang Su and Yi-Hsuan Chen},
  title     = {Speech Emotion Recognition Using Deep Neural Network Considering Verbal and Nonverbal Speech Sounds},
  publisher = {IEEE SigPort},
  url       = {http://sigport.org/4197},
  year      = {2019}
}