Study of dense network approaches for speech emotion recognition

Abstract: 

Deep neural networks have proven very effective in various classification problems and show great promise for emotion recognition from speech. Studies have proposed various architectures that further improve the performance of emotion recognition systems. However, there are still open questions regarding the best approach to building a speech emotion recognition system. Would the system’s performance improve if we had more labeled data? How much do we benefit from data augmentation? Which activation and regularization schemes are more beneficial? How does the depth of the network affect performance? We are collecting the MSP-Podcast corpus, a large dataset with over 30 hours of data, which provides an ideal resource to address these questions. This study explores various dense architectures to predict arousal, valence, and dominance scores. We investigate varying the training set size, the width and depth of the network, and the activation functions used during training. We also study the effect of data augmentation on the network’s performance. We find that a bigger training set improves performance. Batch normalization is crucial to achieving good performance in deeper networks. We do not observe significant differences in performance between residual networks and dense networks.
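To make the design space in the abstract concrete, the following is a minimal sketch of a dense regressor whose width, depth, batch normalization, and residual connections can be switched independently. It is not the authors' implementation: the function names, the ReLU activation, the He-style initialization, the placement of batch normalization before the activation, and the 8-dimensional input are all assumptions chosen for illustration. The batch normalization here has no learnable scale or shift and uses batch statistics only.

```python
import math
import random

def dense(x, w, b):
    """Affine layer. x is a batch (list of rows); w has shape in_dim x out_dim."""
    return [[sum(xi * w[i][j] for i, xi in enumerate(row)) + b[j]
             for j in range(len(b))] for row in x]

def batch_norm(x, eps=1e-5):
    """Normalize each feature to zero mean and unit variance over the batch."""
    n = len(x)
    cols = list(zip(*x))
    means = [sum(c) / n for c in cols]
    variances = [sum((v - m) ** 2 for v in c) / n for c, m in zip(cols, means)]
    return [[(v - m) / math.sqrt(s + eps)
             for v, m, s in zip(row, means, variances)] for row in x]

def relu(x):
    return [[max(0.0, v) for v in row] for row in x]

def init_layer(rng, n_in, n_out):
    """He-style random weights and zero biases (an assumed initialization)."""
    scale = math.sqrt(2.0 / n_in)
    w = [[rng.gauss(0.0, scale) for _ in range(n_out)] for _ in range(n_in)]
    return w, [0.0] * n_out

def build_network(n_features, width, depth, seed=0):
    """Create `depth` hidden layers of `width` units plus a 3-unit output head."""
    rng = random.Random(seed)
    sizes = [n_features] + [width] * depth
    hidden = [init_layer(rng, a, b) for a, b in zip(sizes, sizes[1:])]
    head = init_layer(rng, width, 3)  # arousal, valence, dominance
    return hidden, head

def forward(x, network, use_batch_norm=True, residual=False):
    """Forward pass; skip connections are added only where shapes match."""
    hidden, head = network
    h = x
    for w, b in hidden:
        z = dense(h, w, b)
        if use_batch_norm:
            z = batch_norm(z)
        z = relu(z)
        if residual and len(z[0]) == len(h[0]):
            z = [[a + c for a, c in zip(rz, rh)] for rz, rh in zip(z, h)]
        h = z
    w, b = head
    return dense(h, w, b)

# Hypothetical usage: 4 utterances, each with 8 made-up acoustic features.
rng = random.Random(1)
batch = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(4)]
net = build_network(n_features=8, width=16, depth=3)
scores = forward(batch, net, use_batch_norm=True, residual=True)
```

Each row of `scores` holds three untrained outputs, one per emotional attribute. Changing `width`, `depth`, `use_batch_norm`, and `residual` reproduces, in miniature, the kind of architectural comparison the abstract describes; training, data augmentation, and evaluation are left out of this sketch.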


Paper Details

Authors:
Mohammed Abdelwahab, Carlos Busso
Submitted On:
20 May 2020 - 9:56am
Short Link:
Type:
Poster
Event:
Presenter's Name:
Carlos Busso
Document Year:
2018

Document Files

Abdelwahab_2018-poster.pdf



[1] Mohammed Abdelwahab, Carlos Busso, "Study of dense network approaches for speech emotion recognition", IEEE SigPort, 2020. [Online]. Available: http://sigport.org/5411. Accessed: Jul. 10, 2020.