
ICASSP 2019

ICASSP is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The 2019 conference will feature world-class presentations by internationally renowned speakers and cutting-edge session topics, and provides a fantastic opportunity to network with like-minded professionals from around the world.

MUSIC BOUNDARY DETECTION BASED ON A HYBRID DEEP MODEL OF NOVELTY, HOMOGENEITY, REPETITION AND DURATION


Current state-of-the-art music boundary detection methods rely on local features, an approach that fails to explicitly incorporate the statistical properties of the detected segments. This paper presents a music boundary detection method that simultaneously considers a fitness measure based on the boundary posterior probability, the likelihood of the segment duration sequence, and the acoustic consistency within a segment.
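A fitness measure of this shape can be pictured as a Viterbi-style dynamic program over candidate segment end points: a segmentation's score sums the log boundary posterior at each segment end plus the log-likelihood of each segment's duration. The sketch below is a toy illustration under assumed inputs (the names `boundary_post`, `dur_logprob`, and `max_dur` are hypothetical), not the paper's actual model:

```python
import numpy as np

def segment(boundary_post, dur_logprob, max_dur):
    # DP over segment end points: best[t] is the best score of any
    # segmentation of frames 0..t-1 whose last segment ends at frame t-1.
    T = len(boundary_post)
    best = np.full(T + 1, -np.inf)
    best[0] = 0.0
    back = np.zeros(T + 1, dtype=int)
    for t in range(1, T + 1):
        for d in range(1, min(max_dur, t) + 1):
            score = (best[t - d]
                     + np.log(boundary_post[t - 1] + 1e-12)  # boundary evidence
                     + dur_logprob(d))                       # duration model
            if score > best[t]:
                best[t], back[t] = score, t - d
    # Trace back the optimal boundary positions.
    bounds, t = [], T
    while t > 0:
        bounds.append(t)
        t = int(back[t])
    return bounds[::-1]
```

With a posterior that spikes at frames 10 and 20 and a duration model peaked at 10 frames, the DP recovers boundaries at exactly those spikes.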

Paper Details

Authors:
Akira Maezawa
Submitted On:
15 May 2019 - 11:50am

Document Files

ICASSP2019-maezawa.pdf


[1] Akira Maezawa, "MUSIC BOUNDARY DETECTION BASED ON A HYBRID DEEP MODEL OF NOVELTY, HOMOGENEITY, REPETITION AND DURATION", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4527. Accessed: Sep. 21, 2019.

NON-INTRUSIVE SPEECH QUALITY ASSESSMENT USING NEURAL NETWORKS

Paper Details

Authors:
Ross Cutler, Ivan Tashev, Johannes Gehrke
Submitted On:
15 May 2019 - 10:26am

Document Files

MRS_ICASSP_Poster-v3.pdf


[1] Ross Cutler, Ivan Tashev, Johannes Gehrke, "NON-INTRUSIVE SPEECH QUALITY ASSESSMENT USING NEURAL NETWORKS", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4526. Accessed: Sep. 21, 2019.

The Effect of Light Source on ENF Based Video Forensics


The ENF (Electric Network Frequency) oscillates around its nominal value (50/60 Hz) due to the imbalance between consumed and generated power. The intensity of a light source powered by mains electricity varies with these ENF fluctuations, which can therefore be extracted from videos recorded under mains-powered illumination. This work investigates how the quality of the ENF signal estimated from video is affected by the type of light source, the compression ratio, and social media encoding.
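The basic extraction idea can be illustrated with a minimal peak-picking estimator: per analysis frame, take the FFT peak nearest the nominal mains frequency. This is a generic textbook approach with illustrative parameters (`nominal`, `frame_sec`, `search`), not necessarily the estimator used in the paper:

```python
import numpy as np

def estimate_enf(x, fs, nominal=50.0, frame_sec=2.0, search=1.0):
    # Per-frame FFT peak picking in a narrow band around the nominal
    # mains frequency; the resulting trace is the estimated ENF signal.
    n = int(frame_sec * fs)
    freqs = np.fft.rfftfreq(n, 1.0 / fs)
    band = (freqs > nominal - search) & (freqs < nominal + search)
    win = np.hanning(n)  # reduce spectral leakage from neighboring tones
    trace = []
    for start in range(0, len(x) - n + 1, n):
        mag = np.abs(np.fft.rfft(x[start:start + n] * win))
        trace.append(freqs[band][np.argmax(mag[band])])
    return np.array(trace)
```

A synthetic 50.5 Hz luminance signal sampled at 400 Hz yields a flat trace at 50.5 Hz, confirming the band-limited peak search locks onto the deviation from the 50 Hz nominal.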

Paper Details

Authors:
Saffet Vatansever, Ahmet Emir Dirik, Nasir Memon
Submitted On:
15 May 2019 - 8:17am

Document Files

ICASSP_Poster_v6.pdf


[1] Saffet Vatansever, Ahmet Emir Dirik, Nasir Memon, "The Effect of Light Source on ENF Based Video Forensics", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4524. Accessed: Sep. 21, 2019.

Network Adaptation Strategies for Learning New Classes without Forgetting the Original Ones


We address the problem of adding new classes to an existing classifier without hurting the original classes, when no access is allowed to any sample from the original classes. This problem arises frequently since models are often shared without their training data, due to privacy and data ownership concerns. We propose an easy-to-use approach that modifies the original classifier by retraining a suitable subset of layers using a linearly-tuned, knowledge-distillation regularization.
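A knowledge-distillation regularizer of this general shape can be sketched as a cross-entropy term on the extended label set plus a temperature-softened KL term that keeps the adapted model's outputs on the original classes close to the frozen original model's. The loss below is a generic formulation (the names `adaptation_loss`, `temp`, and `lam` are hypothetical), not the paper's exact linearly-tuned variant:

```python
import numpy as np

def softmax(z, temp=1.0):
    z = z / temp
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def adaptation_loss(new_logits, old_logits, labels, temp=2.0, lam=0.5):
    # Cross-entropy over the extended label set (old + new classes).
    n = len(labels)
    p = softmax(new_logits)
    ce = -np.mean(np.log(p[np.arange(n), labels] + 1e-12))
    # Distillation: KL between the frozen original model's softened outputs
    # and the adapted model's outputs restricted to the original classes.
    k_old = old_logits.shape[1]
    p_old = softmax(old_logits, temp)
    q = softmax(new_logits[:, :k_old], temp)
    kd = np.mean(np.sum(p_old * (np.log(p_old + 1e-12) - np.log(q + 1e-12)),
                        axis=1))
    return ce + lam * kd
```

An adapted model that agrees with the original on the old classes pays no distillation penalty; one that drifts pays both terms.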

Paper Details

Authors:
Hagai Taitelbaum, Gal Chechik, Jacob Goldberger
Submitted On:
15 May 2019 - 7:51am

Document Files

https://ieeexplore.ieee.org/document/8682848


[1] Hagai Taitelbaum, Gal Chechik, Jacob Goldberger, "Network Adaptation Strategies for Learning New Classes without Forgetting the Original Ones", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4523. Accessed: Sep. 21, 2019.

AttS2S-VC: Sequence-to-Sequence Voice Conversion with Attention and Context Preservation Mechanisms


This paper describes a method for voice conversion (VC) based on sequence-to-sequence (Seq2Seq) learning with attention and a context preservation mechanism. Seq2Seq models have excelled at numerous sequence modeling tasks such as speech synthesis and recognition, machine translation, and image captioning.

Paper Details

Authors:
Submitted On:
15 May 2019 - 7:03am

Document Files

2019_05_ICASSP_KouTanaka.pdf


[1] , "AttS2S-VC: Sequence-to-Sequence Voice Conversion with Attention and Context Preservation Mechanisms", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4522. Accessed: Sep. 21, 2019.

Securing smartphone handwritten PIN codes with recurrent neural networks

Paper Details

Authors:
Gaël LE LAN, Vincent FREY
Submitted On:
15 May 2019 - 6:21am

Document Files

icassp_3080.pdf


[1] Gaël LE LAN, Vincent FREY, "Securing smartphone handwritten PIN codes with recurrent neural networks", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4521. Accessed: Sep. 21, 2019.

Dimensional Analysis of Laughter in Female Conversational Speech


How do people hear laughter in expressive, unprompted speech? What is the range of expressivity and function of laughter in this speech, and how can laughter inform the recognition of higher-level expressive dimensions in a corpus? This paper presents a scalable method for collecting natural human description of laughter, transforming the description to a vector of quantifiable laughter dimensions, and deriving baseline classifiers for the different dimensions of expressive laughter.

Paper Details

Authors:
Mary Pietrowicz, Carla Agurto, Jonah Casebeer, Mark Hasegawa-Johnson, Karrie Karahalios
Submitted On:
15 May 2019 - 2:50am

Document Files

ICASSP_Laughter_Paper_36x48_v5_final_for_printing.pdf


[1] Mary Pietrowicz, Carla Agurto, Jonah Casebeer, Mark Hasegawa-Johnson, Karrie Karahalios, "Dimensional Analysis of Laughter in Female Conversational Speech", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4520. Accessed: Sep. 21, 2019.

END-TO-END FEEDBACK LOSS IN SPEECH CHAIN FRAMEWORK VIA STRAIGHT-THROUGH ESTIMATOR


The speech chain mechanism integrates automatic speech recognition (ASR) and text-to-speech synthesis (TTS) modules into a single cycle during training. In our previous work, we applied the speech chain mechanism as a semi-supervised learning method: it enables ASR and TTS to assist each other when given unpaired data, inferring the missing pair and optimizing the models with a reconstruction loss.
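The straight-through trick named in the title is easy to state: the forward pass emits a hard discrete choice (the token the downstream module consumes), while the backward pass routes gradients through the underlying soft distribution, keeping the cycle end-to-end differentiable. A minimal sketch (function names are illustrative; real implementations use an autograd framework's stop-gradient):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def st_argmax(logits):
    # Forward: hard one-hot argmax, i.e. a discrete symbol.
    # Backward (straight-through): gradients flow as if the output were the
    # soft distribution, via hard = soft + stop_gradient(onehot - soft)
    # in an autograd framework; here we just return both parts.
    soft = softmax(logits)
    hard = np.zeros_like(soft)
    hard[np.argmax(soft)] = 1.0
    return hard, soft
```

The forward output is a valid one-hot symbol while the soft part remains a proper distribution for gradient flow.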

Paper Details

Authors:
Andros Tjandra, Sakriani Sakti, Satoshi Nakamura
Submitted On:
14 May 2019 - 8:26pm

Document Files

ICASSP19_Poster_V1.pdf


[1] Andros Tjandra, Sakriani Sakti, Satoshi Nakamura, "END-TO-END FEEDBACK LOSS IN SPEECH CHAIN FRAMEWORK VIA STRAIGHT-THROUGH ESTIMATOR", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4519. Accessed: Sep. 21, 2019.

BLHUC: BAYESIAN LEARNING OF HIDDEN UNIT CONTRIBUTIONS FOR DEEP NEURAL NETWORK SPEAKER ADAPTATION

Paper Details

Authors:
Xurong Xie, Xunying Liu, Tan Lee, Shoukang Hu, Lan Wang
Submitted On:
14 May 2019 - 8:05pm

Document Files

BLHUC BAYESIAN LEARNING OF HIDDEN UNIT CONTRIBUTIONS FOR DEEP NEURAL NETWORK SPEAKER ADAPTATION.pdf


[1] Xurong Xie, Xunying Liu, Tan Lee, Shoukang Hu, Lan Wang, "BLHUC: BAYESIAN LEARNING OF HIDDEN UNIT CONTRIBUTIONS FOR DEEP NEURAL NETWORK SPEAKER ADAPTATION", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4518. Accessed: Sep. 21, 2019.

Fast MVAE: Joint separation and classification of mixed sources based on multichannel variational autoencoder with auxiliary classifier

Paper Details

Authors:
Li Li, Hirokazu Kameoka, Shoji Makino
Submitted On:
14 May 2019 - 5:47pm

Document Files

Li2019ICASSP05poster_v2.pdf


[1] Li Li, Hirokazu Kameoka, Shoji Makino, "Fast MVAE: Joint separation and classification of mixed sources based on multichannel variational autoencoder with auxiliary classifier", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4515. Accessed: Sep. 21, 2019.
