
Speech Processing

A PLLR and Multi-stage Staircase Regression Framework for Speech-based Emotion Prediction


Continuous prediction of dimensional emotions (e.g. arousal and valence) has attracted increasing research interest recently. When processing emotional speech signals, phonetic features have rarely been used, on the assumption that phonetic variability is a confounding factor that degrades emotion recognition and prediction performance. In this paper, instead of eliminating phonetic variability, we investigated whether Phone Log-Likelihood Ratio (PLLR) features could be used to index arousal and valence in a pairwise low/high framework.
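As a rough illustration of what a PLLR feature is, the sketch below maps one frame of phone posteriors to log-likelihood ratios using the commonly cited definition log(p / (1 - p)). The function name `pllr`, the clipping constant `eps`, and the toy posteriors are assumptions made for this example; the listing does not describe the authors' actual feature pipeline.

```python
import math


def pllr(posteriors, eps=1e-10):
    """Convert one frame of phone posteriors into PLLR values.

    Assumes the usual definition: for each phone with posterior p,
    the feature is log(p / (1 - p)). `eps` is a hypothetical clipping
    constant that keeps the logarithm finite when p is 0 or 1.
    """
    features = []
    for p in posteriors:
        p = min(max(p, eps), 1.0 - eps)  # clip into (0, 1)
        features.append(math.log(p / (1.0 - p)))
    return features


# Toy frame with three phone classes (made-up numbers).
frame = [0.5, 0.25, 0.25]
features = pllr(frame)
```

A posterior of 0.5 maps to a ratio of zero, and less likely phones map to negative values, so the features are spread over the real line rather than confined to [0, 1].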

Paper Details

Authors:
Zhaocheng Huang, Julien Epps
Submitted On:
17 March 2017 - 10:17pm

Document Files

DAVID_ICASSP2017_V1.pdf

(88 downloads)

[1] Zhaocheng Huang, Julien Epps, "A PLLR and Multi-stage Staircase Regression Framework for Speech-based Emotion Prediction", IEEE SigPort, 2017. [Online]. Available: http://sigport.org/1775. Accessed: Oct. 23, 2017.

DNN APPROACH TO SPEAKER DIARISATION USING SPEAKER CHANNELS

Paper Details

Authors:
Rosanna Milner, Thomas Hain
Submitted On:
11 March 2017 - 8:47pm

Document Files

talk-dia-icassp17-milner.pdf

(66 downloads)

[1] Rosanna Milner, Thomas Hain, "DNN APPROACH TO SPEAKER DIARISATION USING SPEAKER CHANNELS", IEEE SigPort, 2017. [Online]. Available: http://sigport.org/1678. Accessed: Oct. 23, 2017.

Automatic Speech Emotion Recognition Using Recurrent Neural Networks with Local Attention


Automatic emotion recognition from speech is a challenging task which relies heavily on the effectiveness of the speech features used for classification. In this work, we study the use of deep learning to automatically discover emotionally relevant features from speech. It is shown that using a deep recurrent neural network, we can learn both the short-time frame-level acoustic features that are emotionally relevant, as well as an appropriate temporal aggregation of those features into a compact utterance-level representation.
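The temporal aggregation described above can be pictured as attention-weighted pooling: each frame-level vector gets a softmax weight, and the utterance-level representation is the weighted sum. This is a minimal sketch under assumptions, not the authors' implementation; `attention_pool`, the scoring vector `u`, and the toy frames are hypothetical, and in a real model the frame vectors would come from a recurrent network and `u` would be learned.

```python
import math


def attention_pool(frames, u):
    """Pool frame-level vectors into one utterance-level vector.

    frames: list of equal-length lists (one vector per frame).
    u: hypothetical scoring vector with the same length as each frame.
    Returns (pooled_vector, attention_weights).
    """
    # Score each frame by its dot product with u.
    scores = [sum(ui * hi for ui, hi in zip(u, h)) for h in frames]
    # Numerically stable softmax over time.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of the frame vectors.
    dim = len(frames[0])
    pooled = [sum(w * h[d] for w, h in zip(weights, frames)) for d in range(dim)]
    return pooled, weights


# Two toy frames; a zero scoring vector gives uniform weights (mean pooling).
pooled, weights = attention_pool([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
```

With a non-zero `u`, frames that score higher contribute more to the pooled vector, which is how such a layer can focus on the emotionally salient parts of an utterance.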

Paper Details

Authors:
Emad Barsoum, Cha Zhang
Submitted On:
15 March 2017 - 12:33am

Document Files

icassp2017.pptx

(91 downloads)

icassp2017.pdf

(180 downloads)

[1] Emad Barsoum, Cha Zhang, "Automatic Speech Emotion Recognition Using Recurrent Neural Networks with Local Attention", IEEE SigPort, 2017. [Online]. Available: http://sigport.org/1667. Accessed: Oct. 23, 2017.

ICASSP_FCDNNBSS_poster

Paper Details

Authors:
Submitted On:
2 March 2017 - 2:18pm

Document Files

ICASSP2017_poster.pdf

(84 downloads)

[1] "ICASSP_FCDNNBSS_poster", IEEE SigPort, 2017. [Online]. Available: http://sigport.org/1590. Accessed: Oct. 23, 2017.

An Initial Study of Indonesian Semantic Role Labeling and Its Application on Event Extraction


Semantic role labeling (SRL) is the task of assigning semantic role labels to sentence elements. This paper describes the initial development of an Indonesian semantic role labeling system and its application to extracting event information from Tweets. We compare two feature types when designing the SRL systems: Word-to-Word and Phrase-to-Phrase. Our experiments showed that the Word-to-Word feature approach outperforms the Phrase-to-Phrase approach. Applying the SRL system to an event extraction problem resulted in an overlap-based accuracy of 0.94 for actor identification.

Paper Details

Authors:
Ayu Purwarianti, Lisa Madlberger
Submitted On:
21 November 2016 - 10:37pm

Document Files

presentation_IALP2016_Ade.pdf

(128 downloads)

[1] Ayu Purwarianti, Lisa Madlberger, "An Initial Study of Indonesian Semantic Role Labeling and Its Application on Event Extraction", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/1289. Accessed: Oct. 23, 2017.

An Investigation of Adaptation Techniques for Building Acoustic Models for Hearing-impaired Children in a CAPT Application

Paper Details

Authors:
Yingke Zhu, Brian Mak
Submitted On:
20 October 2016 - 12:55am

Document Files

presentation slides

(158 downloads)

[1] Yingke Zhu, Brian Mak, "An Investigation of Adaptation Techniques for Building Acoustic Models for Hearing-impaired Children in a CAPT Application", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/1260. Accessed: Oct. 23, 2017.

Speaker Diarization System for Autism Children’s Real-Life Audio Data


Paper Details

Authors:
Submitted On:
15 October 2016 - 12:39pm

Document Files

167.pdf

(149 downloads)

[1] "Speaker Diarization System for Autism Children’s Real-Life Audio Data", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/1251. Accessed: Oct. 23, 2017.

Perceptual Evaluation of Natural and Synthesized Speech with Prosodic Focus in Mandarin Production of American Learners


Natural and synthesized speech in L2 Mandarin produced by American English learners was evaluated by native Mandarin speakers to identify focus status and rate the naturalness of the speech. The results reveal that natural speech was recognized and rated better than synthesized speech, early learners’ speech better than late learners’ speech, focused sentences better than no-focus sentences, and initial focus and medial focus better than final focus. Tones of in-focus words interacted with focus status of the sentence and speaker group.

Paper Details

Authors:
Ying Chen, Li Liu, Xueqin Zhao
Submitted On:
14 October 2016 - 1:50pm

Document Files

ChenEtAl._ISCSLP2016_poster.pdf

(142 downloads)

[1] Ying Chen, Li Liu, Xueqin Zhao, "Perceptual Evaluation of Natural and Synthesized Speech with Prosodic Focus in Mandarin Production of American Learners", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/1211. Accessed: Oct. 23, 2017.

Towards Automatic Assessment of Aphasia Speech Using Automatic Speech Recognition Techniques


Aphasia is a type of acquired language impairment caused by brain injury. This paper presents an automatic speech recognition (ASR) based approach to objective assessment of aphasia patients. A dedicated ASR system is developed to facilitate acoustical and linguistic analysis of Cantonese aphasia speech. The acoustic models and the language models are trained with domain- and style-matched speech data from unimpaired control speakers. The speech recognition performance of this system is evaluated on natural oral discourses from patients with various types of aphasia.

Paper Details

Authors:
Ying Qin, Tan Lee, Anthony Pak Hin Kong, Sam Po Law
Submitted On:
14 October 2016 - 5:51am

Document Files

conference-10.18.pdf

(151 downloads)

[1] Ying Qin, Tan Lee, Anthony Pak Hin Kong, Sam Po Law, "Towards Automatic Assessment of Aphasia Speech Using Automatic Speech Recognition Techniques", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/1187. Accessed: Oct. 23, 2017.

Poster for Nonstationary Blind Super-resolution

Paper Details

Authors:
Dehui Yang, Gongguo Tang, Michael Wakin
Submitted On:
30 March 2016 - 3:34am

Document Files

ICASSP_poster_with_reference.pdf

(236 downloads)

[1] Dehui Yang, Gongguo Tang, Michael Wakin, "Poster for Nonstationary Blind Super-resolution", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/1070. Accessed: Oct. 23, 2017.
