ICASSP 2020

ICASSP is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The ICASSP 2020 conference will feature world-class presentations by internationally renowned speakers and cutting-edge session topics, and will provide an opportunity to network with like-minded professionals from around the world.

PACO and PCO-DCT: Patch Consensus and Its Application To Inpainting

Many signal processing methods break the target signal into overlapping patches, process them separately, and then stitch them back to produce an output. How to merge the resulting patches at the overlaps is central to such methods. We propose a novel framework for this type of problem based on the idea that estimated patches should coincide at the overlaps (consensus), and develop an algorithm for solving the general problem. In particular, an efficient method for projecting patches onto the consensus constraint is presented.
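
As a rough illustration of the consensus idea in one dimension, the sketch below assumes patches of equal width taken at a fixed stride, with the patches covering the whole signal. Under that assumption, the orthogonal projection onto the consensus set amounts to averaging the overlapping samples and then re-extracting the patches. The function names and the 1-D setting are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def extract_patches(signal, width, stride):
    """Cut a 1-D signal into overlapping patches (one patch per row)."""
    starts = range(0, len(signal) - width + 1, stride)
    return np.stack([signal[s:s + width] for s in starts])

def stitch_average(patches, length, stride):
    """Merge patches into one signal by averaging samples at the overlaps."""
    width = patches.shape[1]
    total = np.zeros(length)
    count = np.zeros(length)
    for k, patch in enumerate(patches):
        s = k * stride
        total[s:s + width] += patch
        count[s:s + width] += 1
    # Guard against division by zero for samples no patch covers.
    return total / np.maximum(count, 1)

def project_onto_consensus(patches, length, stride):
    """Nearest patch set (in the least-squares sense) whose overlaps coincide."""
    width = patches.shape[1]
    merged = stitch_average(patches, length, stride)
    return extract_patches(merged, width, stride)
```

Patches that were cut from a single signal already agree at the overlaps, so the projection leaves them unchanged; arbitrary patches are moved to the closest consistent set.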

RamirezHounie2020.pdf

Presentation (439)

Categories: Signal and System Modeling, Representation and Estimation
Image/Video Processing
Audio Processing Systems

22 Views

Modeling uncertainty in predicting emotional attributes from spontaneous speech

Kusha_Sridhar-ICASSP_2020.pdf

Kusha_Sridhar-ICASSP_2020.pdf (491)

Categories: Speech Analysis (SPE-ANLS)

33 Views

PSEUDO LIKELIHOOD CORRECTION TECHNIQUE FOR LOW RESOURCE ACCENTED ASR

With large amounts of training data available, ASR systems perform well on native English but poorly on non-native English. Training non-native ASRs or adapting a native English ASR is often limited by the availability of data, particularly in low-resource scenarios. Typical HMM-DNN based ASR decoding requires the pseudo-likelihood of states given an acoustic observation, which changes significantly from native to non-native speech due to accent variation.

ICASSP_2020_avni_final.pdf

Presentation_slides (386)

Categories: Speech Adaptation/Normalization (SPE-ADAP)

25 Views

A RETURN TO DEREVERBERATION IN THE FREQUENCY DOMAIN USING A JOINT LEARNING APPROACH

Dereverberation is often performed in the time-frequency domain, mostly using deep learning approaches. Time-frequency domain processing, however, may not be necessary when reverberation is modeled by the convolution operation. In this paper, we investigate whether dereverberation can be effectively performed in the frequency domain by estimating the complex frequency response of a room impulse response. More specifically, we develop a joint learning framework that uses frequency-domain estimates of the late reverberant response to assist with estimating the direct and early response.
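
The modeling premise in this abstract, that reverberation is a convolution and therefore a pointwise product of spectra, can be checked with a short sketch. The signals here are synthetic stand-ins (random noise and a decaying random impulse response), and the function name is an assumption made for illustration; the learning framework itself is not shown.

```python
import numpy as np

def reverberate_freq(dry, rir):
    """Apply a room impulse response by multiplying zero-padded spectra."""
    n = len(dry) + len(rir) - 1          # length of the linear convolution
    spectrum = np.fft.rfft(dry, n) * np.fft.rfft(rir, n)
    return np.fft.irfft(spectrum, n)

rng = np.random.default_rng(0)
dry = rng.standard_normal(64)                                  # stand-in for dry speech
rir = np.exp(-0.1 * np.arange(32)) * rng.standard_normal(32)   # synthetic decaying RIR
wet = reverberate_freq(dry, rir)
```

The result matches time-domain convolution of the same two signals, which is why estimating the complex frequency response of the room is enough to describe the reverberant signal under this model.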

GRACE_ICASSP2020.v4.pdf

GRACE_ICASSP2020.v4.pdf (407)

Categories: Speech Enhancement (SPE-ENHA)

35 Views

Detect insider attacks using CNN in Decentralized Optimization

This paper studies the security of a gossip-based distributed projected gradient (DPG) algorithm when it is applied to solve a decentralized multi-agent optimization problem. It is known that the gossip-based DPG algorithm is vulnerable to insider attacks because each agent locally estimates its (sub)gradient without any supervision. This work leverages a convolutional neural network (CNN) to detect and localize the insider attackers.
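
For readers unfamiliar with the setting, the sketch below shows one common form of a gossip-based projected gradient iteration: each agent mixes its neighbors' states, takes a local gradient step, and projects onto a constraint set. The quadratic local costs, the ring network, the box constraint, and the step size 1/(t+1) are all assumptions chosen for illustration; no attacker and no CNN detector are modeled.

```python
import numpy as np

def gossip_dpg(targets, weights, steps, lower=-10.0, upper=10.0):
    """Each agent i minimizes 0.5 * (x - targets[i])**2 using only local data."""
    x = np.zeros_like(targets, dtype=float)
    for t in range(steps):
        step = 1.0 / (t + 1)                 # diminishing step size (assumed)
        grad = x - targets                   # computed locally, unsupervised
        # Mix with neighbors, descend, then project onto the box [lower, upper].
        x = np.clip(weights @ x - step * grad, lower, upper)
    return x

# Doubly stochastic mixing matrix for a ring of four agents (hypothetical network).
W = np.array([[0.50, 0.25, 0.00, 0.25],
              [0.25, 0.50, 0.25, 0.00],
              [0.00, 0.25, 0.50, 0.25],
              [0.25, 0.00, 0.25, 0.50]])
c = np.array([0.0, 1.0, 2.0, 3.0])
x_final = gossip_dpg(c, W, steps=2000)
```

With honest agents, all states approach the minimizer of the summed cost, here the mean of `c`. Because `grad` is never checked by anyone else, an agent that reports a manipulated value at that line would steer the whole network, which is the vulnerability the abstract refers to.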

3410.pdf

Detect insider attacks using CNN in Decentralized Optimization (391)

Categories: Other applications of machine learning (MLR-APPL)

25 Views

A Deep Learning Architecture for Epileptic Seizure Classification Based on Object and Action Recognition

Epilepsy affects approximately 1% of the world’s population. The semiology of epileptic seizures contains major clinical signs used to classify epilepsy syndromes, which epileptologists currently evaluate by simple visual inspection of video. Automatic and semiautomatic methods for seizure detection and classification are needed to better support patient monitoring management and diagnostic decisions. Marker-less computer-vision techniques are among the currently promising approaches.

ICASSP2020_Karacsony.pdf

ICASSP2020 slides (1106)

Categories: Pattern recognition and classification (MLR-PATT)

33 Views

We address the problem of detection, in the frequency domain, of an M-dimensional time series modeled as the output of an M × K MIMO filter driven by a K-dimensional Gaussian white noise and disturbed by an additive M-dimensional Gaussian colored noise. We study test statistics based on the Spectral Coherence Matrix (SCM), obtained as a renormalization of the smoothed periodogram matrix of the observed time series over N samples, with smoothing span B.
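
A minimal sketch of how such a coherence matrix could be computed is given below. It assumes the smoothed periodogram at a DFT bin is the average of rank-one periodogram matrices over roughly B + 1 neighboring bins, and that the renormalization divides each entry by the square roots of the corresponding diagonal entries. These definitions, the function name, and the wrap-around at the band edges are assumptions and may differ from the paper's.

```python
import numpy as np

def spectral_coherence(y, freq_index, span):
    """Coherence matrix of an (M, N) time series at one DFT bin."""
    m, n = y.shape
    xi = np.fft.fft(y, axis=1) / np.sqrt(n)         # normalized DFT, M x N
    half = span // 2
    # Neighboring bins around freq_index, wrapping around the spectrum edges.
    bins = np.arange(freq_index - half, freq_index + half + 1) % n
    block = xi[:, bins]
    # Smoothed periodogram: average of outer products over the chosen bins.
    s_hat = block @ block.conj().T / block.shape[1]
    # Renormalize so that every diagonal entry equals one.
    d = 1.0 / np.sqrt(np.real(np.diag(s_hat)))
    return d[:, None] * s_hat * d[None, :]
```

The output is Hermitian with unit diagonal, so its off-diagonal entries measure cross-channel correlation at the chosen frequency independently of each channel's power.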

ICASSP_2020_pr_sentation.pdf

ICASSP_2020_pr_sentation.pdf (486)

Categories: Statistical Signal Processing

64 Views

Source separation with weakly labelled data An approach to computational auditory scene analysis_v1.1.pdf

Source separation with weakly labelled data An approach to computational auditory scene analysis_v1.1.pdf (828)

Categories: Audio and Acoustic Signal Processing

17 Views

State-space Gaussian Process for Drift Estimation in Stochastic Differential Equations

This paper is concerned with the estimation of unknown drift functions of stochastic differential equations (SDEs) from observations of their sample paths. We propose to formulate this as a non-parametric Gaussian process regression problem and use an Itô-Taylor expansion for approximating the SDE. To address the computational complexity problem of Gaussian process regression, we cast the model in an equivalent state-space representation, such that (non-linear) Kalman filters and smoothers can be used.
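
To see why drift estimation can be posed as regression, the sketch below uses the lowest-order (Euler-Maruyama) truncation of the Itô-Taylor expansion: over a short step, the scaled increment of the path equals the drift at the current state plus noise. The scalar SDE with constant diffusion, the function names, and the parameters are assumptions for illustration; the Gaussian process prior and the state-space Kalman smoothing described in the abstract are not shown.

```python
import numpy as np

def simulate_sde(drift, x0, dt, n_steps, diffusion, rng):
    """Euler-Maruyama sample path of dx = drift(x) dt + diffusion dW."""
    x = np.empty(n_steps + 1)
    x[0] = x0
    for k in range(n_steps):
        noise = rng.standard_normal() * np.sqrt(dt)   # Brownian increment
        x[k + 1] = x[k] + drift(x[k]) * dt + diffusion * noise
    return x

def drift_regression_pairs(path, dt):
    """Inputs x_k and noisy drift observations (x_{k+1} - x_k) / dt."""
    return path[:-1], np.diff(path) / dt

rng = np.random.default_rng(0)
path = simulate_sde(lambda x: -x, x0=1.0, dt=0.01, n_steps=500,
                    diffusion=0.5, rng=rng)
inputs, targets = drift_regression_pairs(path, dt=0.01)
```

Any non-parametric regressor can then be fit to `(inputs, targets)`. The observation noise has variance proportional to 1/dt, so many samples are typically needed, which is one motivation for the efficient state-space formulation.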

slide.pdf

slide.pdf (422)

Categories: Bayesian learning; Bayesian signal processing (MLR-BAYL)
Signal and System Modeling, Representation and Estimation

17 Views

Interactive Low Latency Video Streaming Of Volumetric Content

Low latency video streaming of volumetric content is an emerging technology to enable immersive media experiences on mobile devices. Unlike 3DoF scenarios where users are restricted to changes of their head orientation at a single position, volumetric content allows users to move freely within the scene in 6DoF. Although the processing power of mobile devices has increased considerably, streaming volumetric content directly to such devices is still challenging. High-quality volumetric content requires significant data rate and network bandwidth.

icassp2020_podborski.pdf

icassp2020_podborski.pdf (671)

Categories: Multimedia communications and networking
Virtual reality and 3D imaging

282 Views
