ICASSP 2018

ICASSP is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The 2019 conference will feature world-class presentations by internationally renowned speakers and cutting-edge session topics, and will provide a fantastic opportunity to network with like-minded professionals from around the world.

WAVENET BASED LOW RATE SPEECH CODING

Traditional parametric coding of speech facilitates low rate but provides poor reconstruction quality because of the inadequacy of the model used. We describe how a WaveNet generative speech model can be used to generate high quality speech from the bit stream of a standard parametric coder operating at 2.4 kb/s. We compare this parametric coder with a waveform coder based on the same generative model and show that approximating the signal waveform incurs a large rate penalty.

WaveNetCoding_b.pdf (656)

Categories: Audio Coding

54 Views

PLUG-IN MEASURE-TRANSFORMED QUASI-LIKELIHOOD RATIO TEST FOR RANDOM SIGNAL DETECTION

Recently, we developed a robust generalization of the Gaussian quasi-likelihood ratio test (GQLRT). This generalization, called measure-transformed GQLRT (MT-GQLRT), operates by selecting a Gaussian model that best empirically fits a transformed probability measure of the data. In this letter, a plug-in version of the MT-GQLRT is developed for robust detection of a random signal in nonspherical noise. The proposed detector is derived by plugging an empirical measure-transformed noise covariance, obtained from noise-only secondary data, into the MT-GQLRT.

ICASSP_2018_POSTER_VER_3.pdf (564)

Categories: Statistical Signal Processing

24 Views

Acoustic modeling of speech waveform based on multi-resolution, neural network signal processing

Recently, several papers have demonstrated that neural networks (NN) are able to perform the feature extraction as part of the acoustic model. Motivated by the Gammatone feature extraction pipeline, in this paper we extend the waveform based NN model by a second level of time-convolutional element. The proposed extension generalizes the envelope extraction block, and allows the model to learn multi-resolutional representations.

slides-template.pdf (622)

Categories: Acoustic Modeling for Automatic Speech Recognition (SPE-RECO)

20 Views

Novel Realizations of Speech-driven Head Movements with Generative Adversarial Networks

Head movement is an integral part of face-to-face communications. It is important to investigate methodologies to generate naturalistic movements for conversational agents (CAs). The predominant method for head movement generation is using rules based on the meaning of the message. However, the variations of head movements by these methods are bounded by the predefined dictionary of gestures. Speech-driven methods offer an alternative approach, learning the relationship between speech and head movements from real recordings.

Sadoughi_2018-poster.pdf (508)

Categories: Other applications of machine learning (MLR-APPL)

38 Views

FOREGROUND HARMONIC NOISE REDUCTION FOR ROBUST AUDIO FINGERPRINTING

Audio fingerprinting systems are often well designed to cope with a range of broadband noise types; however, they cope less well when presented with additive noise containing sinusoidal components. This is largely because, in a short-time signal representation (over periods of ≈ 20 ms), these noise components are largely indistinguishable from salient components of the desired signal that is to be fingerprinted.

Draft_v2.pdf (895)

Categories: Audio for Multimedia

13 Views

FOREGROUND HARMONIC NOISE REDUCTION FOR ROBUST AUDIO FINGERPRINTING

Draft_v2.pdf (553)

Categories: Audio for Multimedia

28 Views

Sub-diffraction Imaging using Fourier Ptychography and Structured Sparsity

We consider the problem of super-resolution for sub-diffraction imaging. We adapt conventional Fourier ptychographic approaches to the case where the images to be acquired have an underlying structured sparsity. We propose sub-sampling strategies that can be easily adapted to existing ptychographic setups. We then use a novel technique called CoPRAM, with some modifications, to recover sparse (and block-sparse) images from sub-sampled ptychographic measurements.

slides-icassp18-nofigs.pdf (590)

Categories: Image/Video Storage, Retrieval

28 Views

Study Of Dense Network Approaches For Speech Emotion Recognition

Abdelwahab_ICASSP_2018-poster.pdf (496)

Categories: Audio and Acoustic Signal Processing

5 Views

Speech Prediction using an Adaptive Recurrent Neural Network with Application to Packet Loss Concealment

main.pdf (709)

Categories: Audio and Acoustic Signal Processing

256 Views

Unlimited Sampling of Sparse Signals

In a recent paper [1], we introduced the concept of “Unlimited Sampling”. This unique approach circumvents the clipping or saturation problem in conventional analog-to-digital converters (ADCs) by considering a radically different ADC architecture which resets the input voltage before saturation. Such ADCs, also known as Self-Reset ADCs (SR-ADCs), allow for sensing modulo samples.
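
To illustrate what "modulo samples" means here, the following is a minimal sketch and not the authors' implementation. It assumes an idealized self-reset ADC that folds its input into the range [-lam, lam) with a centered modulo; the function name `centered_modulo`, the test signal, and the threshold value are all hypothetical choices made for this example.

```python
import numpy as np

def centered_modulo(x, lam):
    """Fold values into [-lam, lam), as an idealized self-reset ADC might."""
    x = np.asarray(x, dtype=float)
    return np.mod(x + lam, 2.0 * lam) - lam

# Hypothetical test signal whose amplitude exceeds the ADC range.
t = np.linspace(0.0, 1.0, 200, endpoint=False)
signal = 3.0 * np.sin(2.0 * np.pi * 2.0 * t)
lam = 1.0  # hypothetical ADC threshold (assumption for this sketch)

folded = centered_modulo(signal, lam)   # what a self-reset ADC would record
clipped = np.clip(signal, -lam, lam)    # what a saturating ADC would record

# The folded samples differ from the input by integer multiples of 2*lam,
# so the information lost by clipping is in principle still recoverable.
residual = (signal - folded) / (2.0 * lam)
```

In this sketch, `clipped` discards everything above the threshold, whereas `folded` stays within range and differs from the true signal only by an integer multiple of `2 * lam` at each sample.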

AB_ICASSP 2018.pdf (417)

Categories: Sampling and Reconstruction

170 Views
