ICASSP is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The 2019 conference will feature world-class presentations by internationally renowned speakers and cutting-edge session topics, and will provide a fantastic opportunity to network with like-minded professionals from around the world.
- WAVENET BASED LOW RATE SPEECH CODING
Traditional parametric coding of speech achieves a low bit rate but provides poor reconstruction quality because of the inadequacy of the model used. We describe how a WaveNet generative speech model can be used to generate high-quality speech from the bit stream of a standard parametric coder operating at 2.4 kb/s. We compare this parametric coder with a waveform coder based on the same generative model and show that approximating the signal waveform incurs a large rate penalty.
- PLUG-IN MEASURE-TRANSFORMED QUASI-LIKELIHOOD RATIO TEST FOR RANDOM SIGNAL DETECTION
Recently, we developed a robust generalization of the Gaussian quasi-likelihood ratio test (GQLRT). This generalization, called measure-transformed GQLRT (MT-GQLRT), operates by selecting a Gaussian model that best empirically fits a transformed probability measure of the data. In this letter, a plug-in version of the MT-GQLRT is developed for robust detection of a random signal in nonspherical noise. The proposed detector is derived by plugging an empirical measure-transformed noise covariance, obtained from noise-only secondary data, into the MT-GQLRT.
- Acoustic modeling of speech waveform based on multi-resolution, neural network signal processing
Recently, several papers have demonstrated that neural networks (NN) are able to perform the feature extraction as part of the acoustic model. Motivated by the Gammatone feature extraction pipeline, in this paper we extend the waveform-based NN model by a second level of time-convolutional element. The proposed extension generalizes the envelope extraction block, and allows the model to learn multi-resolutional representations.
- Novel Realizations of Speech-driven Head Movements with Generative Adversarial Networks
Head movement is an integral part of face-to-face communications. It is important to investigate methodologies to generate naturalistic movements for conversational agents (CAs). The predominant method for head movement generation is using rules based on the meaning of the message. However, the variations of head movements by these methods are bounded by the predefined dictionary of gestures. Speech-driven methods offer an alternative approach, learning the relationship between speech and head movements from real recordings.
- FOREGROUND HARMONIC NOISE REDUCTION FOR ROBUST AUDIO FINGERPRINTING
Audio fingerprinting systems are often well designed to cope with a range of broadband noise types; however, they cope less well when presented with additive noise containing sinusoidal components. This is largely because, in a short-time signal representation (over periods of ≈ 20 ms), these noise components are largely indistinguishable from salient components of the desirable signal that is to be fingerprinted.
- Sub-diffraction Imaging using Fourier Ptychography and Structured Sparsity
We consider the problem of super-resolution for sub-diffraction imaging. We adapt conventional Fourier ptychographic approaches to the case where the images to be acquired have an underlying structured sparsity. We propose some sub-sampling strategies which can be easily adapted to existing ptychographic setups. We then use a novel technique called CoPRAM, with some modifications, to recover sparse (and block sparse) images from sub-sampled ptychographic measurements.
- Study Of Dense Network Approaches For Speech Emotion Recognition
- Speech Prediction using an Adaptive Recurrent Neural Network with Application to Packet Loss Concealment
- Unlimited Sampling of Sparse Signals
In a recent paper [1], we introduced the concept of “Unlimited Sampling”. This unique approach circumvents the clipping or saturation problem in conventional analog-to-digital converters (ADCs) by considering a radically different ADC architecture which resets the input voltage before saturation. Such ADCs, also known as Self-Reset ADCs (SR-ADCs), allow for sensing modulo samples.
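As a rough illustration of what "modulo samples" means here, the sketch below folds amplitudes into the interval [-lam, lam) with a centered modulo operation. The names `fold` and `sample_folded`, the threshold `lam`, and the test sinusoid are illustrative assumptions rather than anything taken from the paper. The sketch models only the folding nonlinearity, not the SR-ADC hardware or any recovery algorithm.

```python
import math


def fold(x: float, lam: float) -> float:
    """Centered modulo: map x into [-lam, lam).

    Assumed model of a self-reset ADC: whenever the input would exceed
    the range, it wraps around instead of clipping.
    """
    return (x + lam) % (2 * lam) - lam


def sample_folded(signal, lam):
    """Apply the folding nonlinearity to every sample of a sequence."""
    return [fold(v, lam) for v in signal]


# Hypothetical example: a sinusoid whose amplitude (3.0) exceeds the
# assumed ADC range (lam = 1.0), so a conventional ADC would clip it.
lam = 1.0
samples = [3.0 * math.sin(2 * math.pi * k / 16) for k in range(16)]
folded = sample_folded(samples, lam)

# Every folded sample stays inside the range, and it differs from the
# true sample by an integer multiple of 2 * lam.
for x, y in zip(samples, folded):
    multiple = (x - y) / (2 * lam)
    print(f"{x:+.3f} -> {y:+.3f}  (offset = {round(multiple)} * 2*lam)")
```

The folded values never saturate. What is lost is the integer multiple of `2 * lam` per sample, which is the information a reconstruction method would have to recover.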