ICASSP is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The 2019 conference will feature world-class presentations by internationally renowned speakers and cutting-edge session topics, and will provide a fantastic opportunity to network with like-minded professionals from around the world.

Traditional parametric coding of speech achieves low bit rates but provides poor reconstruction quality because the underlying model is inadequate. We describe how a WaveNet generative speech model can be used to generate high-quality speech from the bit stream of a standard parametric coder operating at 2.4 kb/s. We compare this parametric coder with a waveform coder based on the same generative model and show that approximating the signal waveform incurs a large rate penalty.


Recently, we developed a robust generalization of the Gaussian quasi-likelihood ratio test (GQLRT). This generalization, called measure-transformed GQLRT (MT-GQLRT), operates by selecting a Gaussian model that best empirically fits a transformed probability measure of the data. In this letter, a plug-in version of the MT-GQLRT is developed for robust detection of a random signal in nonspherical noise. The proposed detector is derived by plugging an empirical measure-transformed noise covariance, obtained from noise-only secondary data, into the MT-GQLRT.


Recently, several papers have demonstrated that neural networks (NN) are able to perform the feature extraction as part of the acoustic model. Motivated by the Gammatone feature extraction pipeline, in this paper we extend the waveform based NN model by a second level of time-convolutional element. The proposed extension generalizes the envelope extraction block, and allows the model to learn multi-resolutional representations.


Head movement is an integral part of face-to-face communications. It is important to investigate methodologies to generate naturalistic movements for conversational agents (CAs). The predominant method for head movement generation is using rules based on the meaning of the message. However, the variations of head movements by these methods are bounded by the predefined dictionary of gestures. Speech-driven methods offer an alternative approach, learning the relationship between speech and head movements from real recordings.


Audio fingerprinting systems are often well designed to cope with a range of broadband noise types; however, they cope less well when presented with additive noise containing sinusoidal components. This is largely because, in a short-time signal representation (over periods of ≈ 20 ms), these noise components are largely indistinguishable from salient components of the desired signal that is to be fingerprinted.


We consider the problem of super-resolution for sub-diffraction imaging. We adapt conventional Fourier ptychographic approaches to the case where the images to be acquired have an underlying structured sparsity. We propose some sub-sampling strategies which can be easily adapted to existing ptychographic setups. We then use a novel technique called CoPRAM, with some modifications, to recover sparse (and block sparse) images from sub-sampled ptychographic measurements.


In a recent paper [1], we introduced the concept of “Unlimited Sampling”. This unique approach circumvents the clipping or saturation problem in conventional analog-to-digital converters (ADCs) by considering a radically different ADC architecture which resets the input voltage before saturation. Such ADCs, also known as Self-Reset ADCs (SR-ADCs), allow for sensing modulo samples.
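
To make the idea of modulo samples concrete, the following is a minimal sketch that contrasts a conventional clipping ADC with a self-reset ADC. It assumes the reset behaves as an ideal centered modulo with threshold `lam`, which folds any input into the range [-lam, lam). The function names `fold` and `clip`, the threshold value, and the test sinusoid are illustrative assumptions and are not taken from the paper.

```python
import math


def fold(x: float, lam: float) -> float:
    """Idealized self-reset (assumed model): centered modulo into [-lam, lam)."""
    # Python's % returns a value in [0, 2*lam) for a positive divisor,
    # so shifting by lam before and after centers the folded range.
    return (x + lam) % (2 * lam) - lam


def clip(x: float, lam: float) -> float:
    """Conventional ADC behavior: saturate at +/- lam."""
    return max(-lam, min(lam, x))


lam = 1.0  # hypothetical ADC threshold
# A sinusoid whose amplitude (1.5) exceeds the threshold.
samples = [1.5 * math.sin(2 * math.pi * k / 16) for k in range(16)]

folded = [fold(s, lam) for s in samples]    # modulo samples
clipped = [clip(s, lam) for s in samples]   # saturated samples

# Each folded sample differs from the true sample by an integer
# multiple of 2*lam, whereas clipping discards the excess entirely.
residues = [(s - f) / (2 * lam) for s, f in zip(samples, folded)]
```

In this sketch the folded samples never reach the saturation limits, and the original values differ from them only by integer multiples of `2 * lam`. That residual structure is what a recovery procedure could exploit; the recovery step itself is not shown here.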

