ICASSP is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The ICASSP 2020 conference will feature world-class presentations by internationally renowned speakers and cutting-edge session topics, and will provide a fantastic opportunity to network with like-minded professionals from around the world.

Many signal processing methods break the target signal into overlapping patches, process them separately, and then stitch them back to produce an output. How to merge the resulting patches at the overlaps is central to such methods. We propose a novel framework for this type of problem based on the idea that estimated patches should coincide at the overlaps (consensus), and develop an algorithm for solving the general problem. In particular, an efficient method for projecting patches onto the consensus constraint is presented.
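The consensus idea can be illustrated with a minimal sketch. The abstract does not spell out its projection method, so the code below assumes the simplest setting: a 1-D signal, fixed-size patches at a regular stride, and the plain Euclidean norm. In that setting, projecting a stack of patches onto the consensus constraint amounts to averaging all patch values that land on the same sample and re-extracting the patches. The function names `extract_patches` and `project_onto_consensus` are hypothetical.

```python
import numpy as np

def extract_patches(x, size, stride):
    """Cut a 1-D signal into overlapping patches (one patch per row)."""
    starts = range(0, len(x) - size + 1, stride)
    return np.stack([x[s:s + size] for s in starts])

def project_onto_consensus(patches, n, stride):
    """Euclidean projection of a patch stack onto the set of stacks
    whose patches coincide at the overlaps.

    Assumes patch i starts at sample i * stride and that every one of
    the n samples is covered by at least one patch.
    """
    size = patches.shape[1]
    total = np.zeros(n)
    count = np.zeros(n)
    for i, p in enumerate(patches):
        s = i * stride
        total[s:s + size] += p      # sum of all estimates per sample
        count[s:s + size] += 1      # number of patches covering each sample
    signal = total / count          # per-sample average of the estimates
    return extract_patches(signal, size, stride), signal

# Usage: perturb the patches independently, then restore consensus.
rng = np.random.default_rng(0)
clean = np.sin(np.linspace(0.0, 3.0, 10))
noisy_patches = extract_patches(clean, size=4, stride=2)
noisy_patches = noisy_patches + 0.1 * rng.standard_normal(noisy_patches.shape)
consistent_patches, merged = project_onto_consensus(noisy_patches, n=10, stride=2)
```

After the projection, neighbouring patches agree exactly on their shared samples, and `merged` is the single signal they all come from.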

With the availability of large datasets, automatic speech recognition (ASR) systems perform well on native English speech but poorly on non-native English speech. Training a non-native ASR system or adapting a native English one is often limited by the availability of data, particularly in low-resource scenarios. Typical HMM-DNN based ASR decoding requires the pseudo-likelihood of states given an acoustic observation, which changes significantly from native to non-native speech due to accent variation.
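As a rough illustration of the pseudo-likelihood mentioned above: in the standard hybrid HMM-DNN recipe, the network outputs state posteriors, and these are divided by the state priors to obtain a scaled likelihood for decoding. The sketch below shows only that conversion in the log domain. The function name and the toy numbers are assumptions, and nothing here reflects the adaptation method of the paper itself.

```python
import numpy as np

def pseudo_log_likelihood(log_posteriors, log_priors):
    """Convert DNN state log-posteriors log p(s | x_t) into scaled
    log-likelihoods log p(x_t | s) + const by subtracting log p(s).

    log_posteriors: array of shape (frames, states)
    log_priors:     array of shape (states,)
    """
    return log_posteriors - log_priors

# Toy example: two frames, three HMM states (made-up values).
posteriors = np.array([[0.7, 0.2, 0.1],
                       [0.1, 0.3, 0.6]])
priors = np.array([0.5, 0.3, 0.2])
scores = pseudo_log_likelihood(np.log(posteriors), np.log(priors))
```

Because the priors are estimated from the training alignments, a mismatch between native training data and non-native test speech affects both terms of this difference.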

Dereverberation is often performed in the time-frequency domain, mostly using deep learning approaches. Time-frequency domain processing, however, may not be necessary when reverberation is modeled by the convolution operation. In this paper, we investigate whether dereverberation can be effectively performed in the frequency domain by estimating the complex frequency response of a room impulse response. More specifically, we develop a joint learning framework that uses frequency-domain estimates of the late reverberant response to assist with estimating the direct and early response.
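The convolutional model behind this abstract can be checked numerically. The sketch below is not the learned framework of the paper: it assumes a known, synthetic room impulse response `h` and shows that convolution in time becomes a per-bin product in frequency, that the frequency response splits additively into a direct-and-early part and a late part, and that the dry signal follows by division in the idealised noise-free case. The split point `n_early` and all signal lengths are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical room impulse response: unit direct path plus a decaying tail,
# scaled so that |H| stays at or above 0.5 and the division below is safe.
tail = rng.standard_normal(63) * np.exp(-np.arange(63) / 10.0)
tail *= 0.5 / np.sum(np.abs(tail))
h = np.concatenate(([1.0], tail))

x = rng.standard_normal(256)      # dry (anechoic) signal
y = np.convolve(x, h)             # reverberant signal, length 256 + 64 - 1

n = len(y)                        # FFT length that avoids circular wrap-around
X, H, Y = (np.fft.rfft(s, n) for s in (x, h, y))

# Split the response into direct + early taps and late taps.
n_early = 8
h_early = np.concatenate((h[:n_early], np.zeros(len(h) - n_early)))
h_late = h - h_early
H_early = np.fft.rfft(h_early, n)
H_late = np.fft.rfft(h_late, n)

# Idealised dereverberation: divide by the (here, known) frequency response.
x_hat = np.fft.irfft(Y / H, n)[:len(x)]
```

In practice `H` is unknown and must be estimated, which is where a learning framework such as the one described in the abstract would come in.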

This paper studies the security of a gossip-based distributed projected gradient (DPG) algorithm when it is applied to solve a decentralized multi-agent optimization problem. It is known that the gossip-based DPG algorithm is vulnerable to insider attacks because each agent locally estimates its (sub)gradient without any supervision. This work leverages a convolutional neural network (CNN) to detect and localize the insider attackers.

Epilepsy affects approximately 1% of the world’s population. The semiology of epileptic seizures contains major clinical signs used to classify epilepsy syndromes, which epileptologists currently evaluate by simple visual inspection of video. There is a need for automatic and semiautomatic methods for seizure detection and classification to better support patient monitoring management and diagnostic decisions. Marker-less computer-vision techniques are among the currently promising approaches.

We address the problem of detection, in the frequency domain, of an M-dimensional time series modeled as the output of an M × K MIMO filter driven by a K-dimensional Gaussian white noise, and disturbed by an additive M-dimensional Gaussian colored noise. We study test statistics based on the Spectral Coherence Matrix (SCM), obtained as a renormalization of the smoothed periodogram matrix of the observed time series over N samples, with smoothing span B.
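To make the SCM concrete, here is a minimal sketch under stated assumptions: the smoothed periodogram at a frequency bin is taken as the average of rank-one periodograms over B + 1 neighbouring bins (B even), and the renormalization divides each entry by the square roots of the corresponding diagonal entries. The abstract does not give these details, so the exact smoothing window, the function name, and the parameter `k` (centre bin index) are assumptions.

```python
import numpy as np

def spectral_coherence_matrix(y, k, B):
    """Estimate the spectral coherence matrix at frequency bin k.

    y: array of shape (M, N), one row per component of the time series
    k: index of the centre frequency bin
    B: even smoothing span; B + 1 neighbouring bins are averaged
    """
    M, N = y.shape
    xi = np.fft.fft(y, axis=1) / np.sqrt(N)             # normalised DFT, shape (M, N)
    bins = (k + np.arange(-B // 2, B // 2 + 1)) % N     # B + 1 bins around k
    Z = xi[:, bins]                                     # shape (M, B + 1)
    S = Z @ Z.conj().T / (B + 1)                        # smoothed periodogram matrix
    d = 1.0 / np.sqrt(np.real(np.diag(S)))
    return d[:, None] * S * d[None, :]                  # unit-diagonal renormalization

# Usage on pure noise: off-diagonal coherences should be small but nonzero.
rng = np.random.default_rng(0)
noise = rng.standard_normal((3, 512))
C = spectral_coherence_matrix(noise, k=40, B=16)
```

The resulting matrix is Hermitian with ones on the diagonal, so a test statistic only needs to look at how far the off-diagonal entries are from zero.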

This paper is concerned with the estimation of unknown drift functions of stochastic differential equations (SDEs) from observations of their sample paths. We propose to formulate this as a non-parametric Gaussian process regression problem and use an Itô-Taylor expansion for approximating the SDE. To address the computational complexity problem of Gaussian process regression, we cast the model in an equivalent state-space representation, such that (non-linear) Kalman filters and smoothers can be used.
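A minimal sketch of the regression idea, with several simplifications that are not from the abstract: only the lowest-order (Euler-Maruyama) term of the Itô-Taylor expansion is kept, the SDE is scalar with a known constant diffusion `sigma`, the kernel is a squared exponential, and the posterior mean is computed by a direct batch solve rather than the state-space Kalman formulation the authors describe. Under the Euler-Maruyama approximation, each scaled increment (x[k+1] - x[k]) / dt is a noisy observation of the drift at x[k] with variance sigma² / dt.

```python
import numpy as np

def rbf(a, b, ell=1.0, amp=1.0):
    """Squared-exponential kernel between two 1-D arrays of inputs."""
    d = a[:, None] - b[None, :]
    return amp**2 * np.exp(-0.5 * (d / ell) ** 2)

def gp_drift_estimate(path, dt, sigma, x_query, ell=1.0, amp=1.0):
    """Posterior mean of the drift f at x_query, given one sample path.

    Batch GP regression (cubic cost in the path length); this stands in
    for the state-space / Kalman smoother approach, which is not shown.
    """
    x = path[:-1]
    y = np.diff(path) / dt                  # noisy drift observations
    noise_var = sigma**2 / dt               # variance of each observation
    K = rbf(x, x, ell, amp) + noise_var * np.eye(len(x))
    alpha = np.linalg.solve(K, y)
    return rbf(x_query, x, ell, amp) @ alpha

# Usage: simulate dx = -x dt + sigma dW and estimate the drift on a grid.
rng = np.random.default_rng(0)
dt, sigma, steps = 0.01, 0.5, 2000
path = np.empty(steps + 1)
path[0] = 1.0
for k in range(steps):
    path[k + 1] = path[k] - path[k] * dt + sigma * np.sqrt(dt) * rng.standard_normal()
grid = np.linspace(-0.5, 0.5, 5)
drift_hat = gp_drift_estimate(path, dt, sigma, grid)
```

The cubic cost of the batch solve is the computational problem that motivates recasting the model in state-space form.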

Low-latency video streaming of volumetric content is an emerging technology to enable immersive media experiences on mobile devices. Unlike 3DoF scenarios, where users are restricted to changes of their head orientation at a single position, volumetric content allows users to move freely within the scene in 6DoF. Although the processing power of mobile devices has increased considerably, streaming volumetric content directly to such devices is still challenging. High-quality volumetric content requires a significant data rate and network bandwidth.
