Speech Analysis (SPE-ANLS)

Similarity Metric Based on Siamese Neural Networks for Voice Casting

Paper Details

Authors:
Mathias Quillot, Richard Dufour, Vincent Labatut, Jean-François Bonastre
Submitted On:
17 May 2019 - 6:34am

Document Files

Poster_ICASSP19.pdf

(23)

[1] Mathias Quillot, Richard Dufour, Vincent Labatut, Jean-François Bonastre, "Similarity Metric Based on Siamese Neural Networks for Voice Casting", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4547. Accessed: Aug. 23, 2019.

Dimensional Analysis of Laughter in Female Conversational Speech


How do people hear laughter in expressive, unprompted speech? What is the range of expressivity and function of laughter in this speech, and how can laughter inform the recognition of higher-level expressive dimensions in a corpus? This paper presents a scalable method for collecting natural human description of laughter, transforming the description to a vector of quantifiable laughter dimensions, and deriving baseline classifiers for the different dimensions of expressive laughter.

Paper Details

Authors:
Mary Pietrowicz, Carla Agurto, Jonah Casebeer, Mark Hasegawa-Johnson, Karrie Karahalios
Submitted On:
15 May 2019 - 2:50am

Document Files

ICASSP_Laughter_Paper_36x48_v5_final_for_printing.pdf

(25)

[1] Mary Pietrowicz, Carla Agurto, Jonah Casebeer, Mark Hasegawa-Johnson, Karrie Karahalios, "Dimensional Analysis of Laughter in Female Conversational Speech", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4520. Accessed: Aug. 23, 2019.

Speech as a Biomarker for Obstructive Sleep Apnea Detection


Obstructive sleep apnea (OSA) is a prevalent sleep disorder, responsible for reduced quality of life and for significant morbidity and mortality associated with hypertension and cardiovascular diseases. Because OSA is caused by anatomical and functional alterations in the upper airways, we hypothesize that the speech properties of OSA patients are altered, making it possible to detect OSA through voice analysis.

Paper Details

Authors:
M. Catarina Botelho; Isabel Trancoso; Alberto Abad; Teresa Paiva
Submitted On:
10 May 2019 - 12:55pm

Document Files

Speech as a Biomarker for Obstructive Sleep Apnea Detection - Presentation Slides

(22)

[1] M. Catarina Botelho; Isabel Trancoso; Alberto Abad; Teresa Paiva, "Speech as a Biomarker for Obstructive Sleep Apnea Detection", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4373. Accessed: Aug. 23, 2019.

Learning Voice Source Related Information for Depression Detection

Paper Details

Authors:
S. Pavankumar Dubagunta, Bogdan Vlasenko, Mathew Magimai Doss
Submitted On:
10 May 2019 - 10:56am

Document Files

Poster___Learning_Voice_Source.pdf

(24)

[1] S. Pavankumar Dubagunta, Bogdan Vlasenko, Mathew Magimai Doss, "Learning Voice Source Related Information for Depression Detection", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4355. Accessed: Aug. 23, 2019.

ICASSP 2019 Poster - Privacy-preserving Paralinguistic Tasks


Speech is one of the primary means of communication for humans. It can be viewed as a carrier for information on several levels, as it conveys not only the meaning and intention predetermined by a speaker, but also paralinguistic and extralinguistic information about the speaker’s age, gender, personality, emotional state, health state and affect. This makes it a particularly sensitive biometric that should be protected.

Paper Details

Authors:
Francisco Teixeira, Alberto Abad, Isabel Trancoso
Submitted On:
10 May 2019 - 10:21am

Document Files

Privacy-preserving Paralinguistic Tasks - Poster

(32)

[1] Francisco Teixeira, Alberto Abad, Isabel Trancoso, "ICASSP 2019 Poster - Privacy-preserving Paralinguistic Tasks", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4336. Accessed: Aug. 23, 2019.

Privacy-preserving Paralinguistic Tasks


Speech is one of the primary means of communication for humans. It can be viewed as a carrier for information on several levels, as it conveys not only the meaning and intention predetermined by a speaker, but also paralinguistic and extralinguistic information about the speaker’s age, gender, personality, emotional state, health state and affect. This makes it a particularly sensitive biometric that should be protected.

Paper Details

Authors:
Alberto Abad, Isabel Trancoso
Submitted On:
10 May 2019 - 9:54am

Document Files

Privacy-preserving Paralinguistic Tasks Poster

(32)

[1] Alberto Abad, Isabel Trancoso, "Privacy-preserving Paralinguistic Tasks", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4334. Accessed: Aug. 23, 2019.

A Study on how Pre-Whitening Influences Fundamental Frequency Estimation


This paper deals with the influence of pre-whitening on fundamental frequency estimation in noisy conditions. Parametric fundamental frequency estimators commonly assume that the noise is white and Gaussian and are therefore statistically efficient only under those conditions. In many practical applications the noise is coloured, which often leads to misidentifying an integer divisor or multiple of the true fundamental frequency (i.e., octave errors).

Paper Details

Authors:
Alfredo Esquivel Jaramillo, Jesper Kjær Nielsen, and Mads Græsbøll Christensen
Submitted On:
8 May 2019 - 10:29am

Document Files

fundamental frequency

(36)

[1] Alfredo Esquivel Jaramillo, Jesper Kjær Nielsen, and Mads Græsbøll Christensen, "A Study on how Pre-Whitening Influences Fundamental Frequency Estimation", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4116. Accessed: Aug. 23, 2019.

AN INTERACTION-AWARE ATTENTION NETWORK FOR SPEECH EMOTION RECOGNITION IN SPOKEN DIALOGS


Obtaining robust speech emotion recognition (SER) in scenarios of spoken interactions is critical to the development of next-generation human-machine interfaces. Previous research has largely focused on performing SER by modeling each utterance of the dialog in isolation, without considering the transactional and dependent nature of human-human conversation. In this work, we propose an interaction-aware attention network (IAAN) that incorporates contextual information in the learned vocal representation through a novel attention mechanism.

Paper Details

Authors:
Sung-Lin Yeh, Yun-Shao Lin, Chi-Chun Lee
Submitted On:
9 May 2019 - 11:36am

Document Files

ICASSP2019_poster_interaction.pdf

(22)

AN INTERACTION-AWARE ATTENTION NETWORK FOR SPEECH EMOTION RECOGNITION IN SPOKEN DIALOGS.pdf

(17)

[1] Sung-Lin Yeh, Yun-Shao Lin, Chi-Chun Lee, "AN INTERACTION-AWARE ATTENTION NETWORK FOR SPEECH EMOTION RECOGNITION IN SPOKEN DIALOGS", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4097. Accessed: Aug. 23, 2019.

Improving Speech Emotion Recognition with Unsupervised Representation Learning on Unlabeled Speech

Paper Details

Authors:
Submitted On:
8 May 2019 - 3:55am

Document Files

Improving_SER_with_RL.pdf

(32)

[1] "Improving Speech Emotion Recognition with Unsupervised Representation Learning on Unlabeled Speech", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4041. Accessed: Aug. 23, 2019.

A DEEP NEURAL NETWORK BASED END TO END MODEL FOR JOINT HEIGHT AND AGE ESTIMATION FROM SHORT DURATION SPEECH


Automatic height and age prediction of a speaker has a wide variety of applications, such as speaker profiling and forensics. Often in such applications, only a few seconds of speech data are available to reliably estimate the speaker parameters. Traditionally, age and height were predicted separately using different estimation algorithms. In this work, we propose a unified DNN architecture to predict both the height and age of a speaker from short durations of speech.

Paper Details

Authors:
Shareef Babu Kalluri, Deepu Vijayasenan, Sriram Ganapathy
Submitted On:
8 May 2019 - 1:55am

Document Files

ICASSP poster

(202)

[1] Shareef Babu Kalluri, Deepu Vijayasenan, Sriram Ganapathy, "A DEEP NEURAL NETWORK BASED END TO END MODEL FOR JOINT HEIGHT AND AGE ESTIMATION FROM SHORT DURATION SPEECH", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4014. Accessed: Aug. 23, 2019.
