
ICASSP 2018

ICASSP is the world's largest and most comprehensive technical conference focused on signal processing and its applications, featuring world-class presentations by internationally renowned speakers, cutting-edge session topics, and opportunities to network with like-minded professionals from around the world.

Acoustic modeling of speech waveform based on multi-resolution, neural network signal processing


Recently, several papers have demonstrated that neural networks (NN) are able to perform feature extraction as part of the acoustic model. Motivated by the Gammatone feature extraction pipeline, in this paper we extend the waveform-based NN model with a second level of time-convolutional elements. The proposed extension generalizes the envelope extraction block and allows the model to learn multi-resolutional representations.
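
As a rough illustration of this idea, the sketch below (PyTorch, with made-up filter counts, kernel lengths, and an assumed 16 kHz sample rate with 10 ms frame shift; not the authors' exact architecture) stacks a learned waveform filter bank with a second, per-band time convolution that plays the role of the generalized envelope-extraction block.

```python
import torch
import torch.nn as nn

class TwoLevelTimeConvFrontEnd(nn.Module):
    """Hypothetical waveform front end: learned filter bank + learned envelope stage."""
    def __init__(self, n_filters=64, filter_len=400, env_len=25):
        super().__init__()
        # First level: learned filter bank applied to the raw waveform
        # (roughly the role of the Gammatone band-pass filters);
        # stride 160 assumes 16 kHz audio and a 10 ms frame shift.
        self.filterbank = nn.Conv1d(1, n_filters, kernel_size=filter_len,
                                    stride=160, padding=filter_len // 2)
        # Second level: per-band time convolution generalizing the fixed
        # rectify-and-low-pass envelope extraction; using several kernel
        # lengths here would yield multi-resolution envelopes.
        self.envelope = nn.Conv1d(n_filters, n_filters, kernel_size=env_len,
                                  padding=env_len // 2, groups=n_filters)

    def forward(self, waveform):                      # (batch, 1, samples)
        bands = torch.abs(self.filterbank(waveform))  # rectified band signals
        return torch.log1p(torch.relu(self.envelope(bands)))

feats = TwoLevelTimeConvFrontEnd()(torch.randn(2, 1, 16000))  # -> (2, 64, 101)
```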

Paper Details

Authors: Zoltán Tüske, Ralf Schlüter, Hermann Ney
Submitted On: 2 May 2018 - 3:00pm

Document Files

slides-template.pdf

Cite

[1] Zoltán Tüske, Ralf Schlüter, Hermann Ney, "Acoustic modeling of speech waveform based on multi-resolution, neural network signal processing," IEEE SigPort, 2018. [Online]. Available: http://sigport.org/3199. Accessed: May 26, 2019.

Novel Realizations of Speech-driven Head Movements with Generative Adversarial Networks


Head movement is an integral part of face-to-face communication. It is important to investigate methodologies to generate naturalistic movements for conversational agents (CAs). The predominant method for head movement generation uses rules based on the meaning of the message. However, the variety of head movements produced by these methods is bounded by the predefined dictionary of gestures. Speech-driven methods offer an alternative approach, learning the relationship between speech and head movements from real recordings.
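
For intuition only, the hypothetical sketch below shows the kind of speech-conditioned generator a GAN-based approach might use: acoustic features plus a noise sequence are mapped to per-frame head rotations, so different noise samples can yield different yet speech-consistent motions. All dimensions and the network layout are assumptions, not the paper's design.

```python
import torch
import torch.nn as nn

class HeadMotionGenerator(nn.Module):
    def __init__(self, speech_dim=40, noise_dim=16, hidden=128, out_dim=3):
        super().__init__()
        self.rnn = nn.GRU(speech_dim + noise_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, out_dim)  # e.g. head pitch/yaw/roll per frame

    def forward(self, speech, noise):
        # speech: (batch, frames, speech_dim); noise: (batch, frames, noise_dim)
        h, _ = self.rnn(torch.cat([speech, noise], dim=-1))
        return self.out(h)

gen = HeadMotionGenerator()
motion = gen(torch.randn(4, 100, 40), torch.randn(4, 100, 16))  # -> (4, 100, 3)
```

In an adversarial setup, a discriminator would then judge whether a (speech, motion) pair looks natural, pushing the generator toward realistic and varied head movements.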

Paper Details

Authors: Najmeh Sadoughi, Carlos Busso
Submitted On: 1 May 2018 - 8:43pm

Document Files

Sadoughi_2018-poster.pdf

Cite

[1] Najmeh Sadoughi, Carlos Busso, "Novel Realizations of Speech-driven Head Movements with Generative Adversarial Networks," IEEE SigPort, 2018. [Online]. Available: http://sigport.org/3198. Accessed: May 26, 2019.

FOREGROUND HARMONIC NOISE REDUCTION FOR ROBUST AUDIO FINGERPRINTING


Audio fingerprinting systems are often well designed to cope with a range of broadband noise types; however, they cope less well when presented with additive noise containing sinusoidal components. This is largely because, in a short-time signal representation (over periods of ≈ 20 ms), these noise components are largely indistinguishable from salient components of the desirable signal that is to be fingerprinted.
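
The short numerical example below (illustrative only, not the paper's method) makes this ambiguity concrete: within a single ≈ 20 ms analysis frame, an additive foreground sinusoid and a genuine tonal component of the target audio both appear as narrow spectral peaks of similar magnitude, so frame-local information alone cannot separate them.

```python
import numpy as np

fs = 8000
t = np.arange(int(0.02 * fs)) / fs                 # one 20 ms frame (160 samples)
target_tone = 0.8 * np.sin(2 * np.pi * 440 * t)    # salient component of the signal
noise_tone = 0.8 * np.sin(2 * np.pi * 1000 * t)    # additive sinusoidal interference

spectrum = np.abs(np.fft.rfft((target_tone + noise_tone) * np.hanning(len(t))))
peak_freqs = np.argsort(spectrum)[-2:] * fs / len(t)   # 50 Hz bin spacing
print(sorted(peak_freqs))   # ~[450.0, 1000.0]: two comparable narrow peaks
```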

Paper Details

Submitted On: 30 April 2018 - 7:27pm

Document Files

Draft_v2.pdf

[1] , "FOREGROUND HARMONIC NOISE REDUCTION FOR ROBUST AUDIO FINGERPRINTING", IEEE SigPort, 2018. [Online]. Available: http://sigport.org/3197. Accessed: May. 26, 2019.
@article{3197-18,
url = {http://sigport.org/3197},
author = { },
publisher = {IEEE SigPort},
title = {FOREGROUND HARMONIC NOISE REDUCTION FOR ROBUST AUDIO FINGERPRINTING},
year = {2018} }
TY - EJOUR
T1 - FOREGROUND HARMONIC NOISE REDUCTION FOR ROBUST AUDIO FINGERPRINTING
AU -
PY - 2018
PB - IEEE SigPort
UR - http://sigport.org/3197
ER -
. (2018). FOREGROUND HARMONIC NOISE REDUCTION FOR ROBUST AUDIO FINGERPRINTING. IEEE SigPort. http://sigport.org/3197
, 2018. FOREGROUND HARMONIC NOISE REDUCTION FOR ROBUST AUDIO FINGERPRINTING. Available at: http://sigport.org/3197.
. (2018). "FOREGROUND HARMONIC NOISE REDUCTION FOR ROBUST AUDIO FINGERPRINTING." Web.
1. . FOREGROUND HARMONIC NOISE REDUCTION FOR ROBUST AUDIO FINGERPRINTING [Internet]. IEEE SigPort; 2018. Available from : http://sigport.org/3197

Sub-diffraction Imaging using Fourier Ptychography and Structured Sparsity


We consider the problem of super-resolution for sub-diffraction imaging. We adapt conventional Fourier ptychographic approaches for the case where the images to be acquired have an underlying structured sparsity. We propose some sub-sampling strategies which can be easily adapted to existing ptychographic setups. We then use a novel technique called CoPRAM, with some modifications, to recover sparse (and block-sparse) images from sub-sampled ptychographic measurements.
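
The sketch below illustrates the kind of sub-sampled, magnitude-only measurement model that sparse phase retrieval methods such as CoPRAM operate on, together with a crude sparse estimation step. The Gaussian measurement matrix and the single thresholding pass are simplifications for illustration; the actual ptychographic operator and the full CoPRAM algorithm (spectral initialization followed by alternating minimization) are more involved.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, s = 256, 120, 10                 # signal length, measurements (m < n), sparsity
x = np.zeros(n)
x[rng.choice(n, s, replace=False)] = rng.standard_normal(s)   # s-sparse image (vectorized)

A = rng.standard_normal((m, n)) / np.sqrt(m)   # stand-in for the sub-sampled operator
y = np.abs(A @ x)                              # phaseless (intensity-like) measurements

# Crude first pass: pretend all measurement signs are +1, solve least squares,
# then hard-threshold to the s largest entries to enforce sparsity. CoPRAM
# alternates sign estimation and sparse estimation steps from a better start.
z = np.linalg.lstsq(A, y, rcond=None)[0]
support = np.argsort(np.abs(z))[-s:]
x_hat = np.zeros(n)
x_hat[support] = z[support]
```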

Paper Details

Authors: Gauri Jagatap, Zhengyu Chen, Chinmay Hegde, Namrata Vaswani
Submitted On: 30 April 2018 - 2:40pm

Document Files

slides-icassp18-nofigs.pdf

Cite

[1] Gauri Jagatap, Zhengyu Chen, Chinmay Hegde, Namrata Vaswani, "Sub-diffraction Imaging using Fourier Ptychography and Structured Sparsity," IEEE SigPort, 2018. [Online]. Available: http://sigport.org/3195. Accessed: May 26, 2019.

Study Of Dense Network Approaches For Speech Emotion Recognition

Paper Details

Authors: Mohammed Abdelwahab, Carlos Busso
Submitted On: 30 April 2018 - 11:45am

Document Files

Abdelwahab_ICASSP_2018-poster.pdf

Cite

[1] Mohammed Abdelwahab, Carlos Busso, "Study Of Dense Network Approaches For Speech Emotion Recognition," IEEE SigPort, 2018. [Online]. Available: http://sigport.org/3192. Accessed: May 26, 2019.

Speech Prediction using an Adaptive Recurrent Neural Network with Application to Packet Loss Concealment

Paper Details

Submitted On: 30 April 2018 - 10:36am

Document Files

main.pdf

[1] , "Speech Prediction using an Adaptive Recurrent Neural Network with Application to Packet Loss Concealment", IEEE SigPort, 2018. [Online]. Available: http://sigport.org/3189. Accessed: May. 26, 2019.
@article{3189-18,
url = {http://sigport.org/3189},
author = { },
publisher = {IEEE SigPort},
title = {Speech Prediction using an Adaptive Recurrent Neural Network with Application to Packet Loss Concealment},
year = {2018} }
TY - EJOUR
T1 - Speech Prediction using an Adaptive Recurrent Neural Network with Application to Packet Loss Concealment
AU -
PY - 2018
PB - IEEE SigPort
UR - http://sigport.org/3189
ER -
. (2018). Speech Prediction using an Adaptive Recurrent Neural Network with Application to Packet Loss Concealment. IEEE SigPort. http://sigport.org/3189
, 2018. Speech Prediction using an Adaptive Recurrent Neural Network with Application to Packet Loss Concealment. Available at: http://sigport.org/3189.
. (2018). "Speech Prediction using an Adaptive Recurrent Neural Network with Application to Packet Loss Concealment." Web.
1. . Speech Prediction using an Adaptive Recurrent Neural Network with Application to Packet Loss Concealment [Internet]. IEEE SigPort; 2018. Available from : http://sigport.org/3189

Unlimited Sampling of Sparse Signals


In a recent paper [1], we introduced the concept of “Unlimited Sampling”. This unique approach circumvents the clipping or saturation problem in conventional analog-to-digital converters (ADCs) by considering a radically different ADC architecture which resets the input voltage before saturation. Such ADCs, also known as Self-Reset ADCs (SR-ADCs), allow for sensing modulo samples.
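
A small numerical sketch of this modulo-sampling model is given below (the threshold and test signal are arbitrary choices, not values from the paper): samples are folded into [-λ, λ), and with sufficient oversampling the centered modulo of the folded samples' finite differences equals the finite differences of the original signal, which can then be summed back up, up to an unknown constant multiple of 2λ.

```python
import numpy as np

lam = 0.5                                        # ADC folding threshold
mod = lambda v: np.mod(v + lam, 2 * lam) - lam   # centered modulo onto [-lam, lam)

t = np.linspace(0, 1, 2000)                      # heavily oversampled grid
x = 3.0 * np.sin(2 * np.pi * 3 * t)              # amplitude far beyond the ADC range
y = mod(x)                                       # what a self-reset ADC would record

dx = mod(np.diff(y))                             # equals np.diff(x) since |diff(x)| < lam
x_hat = np.concatenate(([x[0]], x[0] + np.cumsum(dx)))  # x[0] stands in for the unknown offset

print(np.max(np.abs(x_hat - x)))                 # ~1e-12: recovery up to round-off
```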

Paper Details

Authors: Felix Krahmer, Ramesh Raskar
Submitted On: 30 April 2018 - 2:45am

Document Files

AB_ICASSP 2018.pdf

Cite

[1] Felix Krahmer, Ramesh Raskar, "Unlimited Sampling of Sparse Signals," IEEE SigPort, 2018. [Online]. Available: http://sigport.org/3188. Accessed: May 26, 2019.

Cyborg Speech: Deep Multilingual Speech Synthesis for Generating Segmental Foreign Accent with Natural Prosody


We describe a new application of deep-learning-based speech synthesis, namely multilingual speech synthesis for generating controllable foreign accent. Specifically, we train a DBLSTM-based acoustic model on non-accented multilingual speech recordings from a speaker native in several languages. By copying durations and pitch contours from a pre-recorded utterance of the desired prompt, natural prosody is achieved. We call this paradigm "cyborg speech" as it combines human and machine speech parameters.
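
As a toy illustration of this copy-synthesis step (hypothetical feature layout, not the paper's toolchain), the function below imposes the F0 contour of a pre-recorded natural utterance onto the synthesizer's predicted acoustic features.

```python
import numpy as np

def impose_natural_f0(predicted_feats, natural_f0, f0_column=0):
    """Replace the predicted F0 track with a natural contour of matching length."""
    frames = predicted_feats.shape[0]
    # Linearly resample the natural contour to the predicted frame count;
    # if durations are also copied from the same utterance, the frame counts
    # already match and this resampling changes nothing.
    src = np.linspace(0.0, 1.0, len(natural_f0))
    dst = np.linspace(0.0, 1.0, frames)
    out = predicted_feats.copy()
    out[:, f0_column] = np.interp(dst, src, natural_f0)
    return out
```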

Paper Details

Authors: Jaime Lorenzo-Trueba, Mariko Kondo, Junichi Yamagishi
Submitted On: 29 April 2018 - 1:59pm

Document Files

Cyborg Speech presentation slides

Cite

[1] Jaime Lorenzo-Trueba, Mariko Kondo, Junichi Yamagishi, "Cyborg Speech: Deep Multilingual Speech Synthesis for Generating Segmental Foreign Accent with Natural Prosody," IEEE SigPort, 2018. [Online]. Available: http://sigport.org/3187. Accessed: May 26, 2019.

Invisible Geo-Location Signature in a Single Image


Geo-tagging images of interest is increasingly important to law enforcement, national security, and journalism. Many images today do not carry location tags that are trustworthy and resilient to tampering, and landmark-based visual clues may not be readily present in every image, especially in those taken indoors. In this paper, we exploit an invisible signature from the power grid, the Electric Network Frequency (ENF) signal, which can be inherently recorded in a sensing stream at the time of capture and carries useful location information.
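
For context, the generic sketch below extracts an ENF trace from a recording by tracking the dominant spectral peak near the nominal mains frequency; the 60 Hz nominal value, frame lengths, and plain peak picking are assumptions for illustration, and the paper's actual location-inference pipeline is not reproduced here.

```python
import numpy as np

def estimate_enf(signal, fs, nominal=60.0, band=1.0, frame_s=8.0, hop_s=1.0):
    """Frame-wise estimate of the mains frequency from a recorded stream."""
    frame, hop = int(frame_s * fs), int(hop_s * fs)
    freqs = np.fft.rfftfreq(frame, 1.0 / fs)
    in_band = (freqs > nominal - band) & (freqs < nominal + band)
    enf = []
    for start in range(0, len(signal) - frame + 1, hop):
        seg = signal[start:start + frame] * np.hanning(frame)
        mag = np.abs(np.fft.rfft(seg))
        enf.append(freqs[in_band][np.argmax(mag[in_band])])  # strongest peak near 60 Hz
    return np.array(enf)
```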

Paper Details

Submitted On: 1 March 2019 - 1:27pm

Document Files

ICASSP18_presentation_v2.pdf

[1] , "Invisible Geo-Location Signature in a Single Image", IEEE SigPort, 2018. [Online]. Available: http://sigport.org/3186. Accessed: May. 26, 2019.
@article{3186-18,
url = {http://sigport.org/3186},
author = { },
publisher = {IEEE SigPort},
title = {Invisible Geo-Location Signature in a Single Image},
year = {2018} }
TY - EJOUR
T1 - Invisible Geo-Location Signature in a Single Image
AU -
PY - 2018
PB - IEEE SigPort
UR - http://sigport.org/3186
ER -
. (2018). Invisible Geo-Location Signature in a Single Image. IEEE SigPort. http://sigport.org/3186
, 2018. Invisible Geo-Location Signature in a Single Image. Available at: http://sigport.org/3186.
. (2018). "Invisible Geo-Location Signature in a Single Image." Web.
1. . Invisible Geo-Location Signature in a Single Image [Internet]. IEEE SigPort; 2018. Available from : http://sigport.org/3186
