ICASSP 2023

IEEE ICASSP 2023 - IEEE International Conference on Acoustics, Speech and Signal Processing is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The ICASSP 2023 conference will feature world-class presentations by internationally renowned speakers and cutting-edge session topics, and will provide an opportunity to network with like-minded professionals from around the world.

MABNet: Master Assistant Buddy Network with Hybrid Learning for Image Retrieval

Image retrieval has garnered growing interest in recent times. Current approaches are either supervised or self-supervised, and do not exploit the benefits of hybrid learning that uses both supervision and self-supervision. We present a novel Master Assistant Buddy Network (MABNet) for image retrieval that incorporates both learning mechanisms. MABNet consists of a master block and an assistant block, both learning independently through supervision and collectively via self-supervision.

MABNET_ICASSP (6).pdf

MABNet (51)

Categories: Image/Video Storage, Retrieval

8 Views

Reducing the Communication and Computational Cost of Random Fourier Features Kernel LMS in Diffusion Networks

Diffusion kernel algorithms are interesting tools for distributed nonlinear estimation. However, for the sake of feasibility, it is essential in practice to restrict their computational cost and the number of communications. In this paper, we propose a censoring algorithm for adaptive kernel diffusion networks based on random Fourier features that locally adapts the number of nodes censored according to the estimation error.
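
To make the ingredients named in this abstract concrete, the following is a minimal single-node sketch of kernel LMS in a random Fourier feature (RFF) space, with a simple error-based rule that counts how often the node would transmit. The function names, the fixed threshold, and the toy target `sin(x)` are assumptions made for illustration; the paper's actual censoring rule adapts the number of censored nodes across a diffusion network and is not reproduced here.

```python
import math
import random


def make_rff(dim_in, num_features, sigma, rng):
    """Build an RFF map approximating a Gaussian kernel of bandwidth sigma."""
    # Frequencies are drawn from N(0, 1/sigma^2), phases from U[0, 2*pi).
    freqs = [[rng.gauss(0.0, 1.0 / sigma) for _ in range(dim_in)]
             for _ in range(num_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    scale = math.sqrt(2.0 / num_features)

    def features(x):
        return [scale * math.cos(sum(f * xi for f, xi in zip(row, x)) + p)
                for row, p in zip(freqs, phases)]

    return features


def rff_klms_with_censoring(samples, features, num_features, step, threshold):
    """Run LMS on the RFF weights and count would-be transmissions."""
    w = [0.0] * num_features
    errors = []
    sent = 0
    for x, y in samples:
        z = features(x)
        e = y - sum(wi * zi for wi, zi in zip(w, z))   # a priori error
        w = [wi + step * e * zi for wi, zi in zip(w, z)]  # LMS update
        # Hypothetical censoring rule: stay silent when the error is small.
        if abs(e) > threshold:
            sent += 1
        errors.append(e)
    return w, errors, sent


# Toy usage: learn y = sin(x) from streaming samples.
rng = random.Random(0)
features = make_rff(1, 50, 1.0, rng)
samples = []
for _ in range(3000):
    x = rng.uniform(-3.0, 3.0)
    samples.append(([x], math.sin(x)))
w, errors, sent = rff_klms_with_censoring(samples, features, 50, 0.5, 0.1)
early_mse = sum(e * e for e in errors[:100]) / 100
late_mse = sum(e * e for e in errors[-100:]) / 100
print(early_mse, late_mse, sent)
```

Because the feature dimension is fixed, the per-sample cost and the size of any transmitted weight vector stay constant, unlike dictionary-based kernel LMS where both grow with the data.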

Slides_ICASSP_2023.pdf

Presentation Slides (Paper Code SPCN-L2.4) (90)

Categories: Communication and Sensing aspects of Sensor Networks, Wireless and Ad-Hoc Networks; Graphical and kernel methods (MLR-GRKN); Adaptive Signal Processing

17 Views

DYNAMIC SOURCE LOCALIZATION AND FUNCTIONAL CONNECTIVITY ESTIMATION WITH STATE-SPACE MODELS: PRELIMINARY FEASIBILITY ANALYSIS - Poster and Video

Dynamic imaging of source and functional connectivity (FC) using electroencephalographic (EEG) signals is essential for understanding the brain and cognition with technology affordable enough to be widely applied to studying changes associated with healthy ageing and the progression of neuropathology. We apply recently developed state-space models and algorithms, which simultaneously solve the large-scale EEG inverse and FC estimation problems, to group analysis.

JSB_ICASSP_2023.pdf

ICASSP2023_Poster (74)

Categories: Signal and System Modeling, Representation and Estimation

103 Views

SPEECH-BASED EMOTION RECOGNITION WITH SELF-SUPERVISED MODELS USING ATTENTIVE CHANNEL-WISE CORRELATIONS AND LABEL SMOOTHING

When recognizing emotions from speech, we encounter two common problems: how to optimally capture emotion-relevant information from the speech signal and how to best quantify or categorize the noisy subjective emotion labels. Self-supervised pre-trained representations can robustly capture information from speech, enabling state-of-the-art results in many downstream tasks, including emotion recognition. However, better ways of aggregating the information across time need to be considered, as the relevant emotion information is likely to appear piecewise and not uniformly across the signal.
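
The title of this entry names label smoothing as one way to handle noisy subjective labels. The sketch below shows the standard uniform form, in which a fraction `epsilon` of the probability mass is spread evenly over all classes. The four emotion classes and the value of `epsilon` are assumptions made for illustration, and the paper may use a different smoothing scheme.

```python
import math


def smooth_labels(target_index, num_classes, epsilon):
    """Return a uniformly smoothed target distribution for one example."""
    if not 0 <= target_index < num_classes:
        raise ValueError("target_index out of range")
    if not 0.0 <= epsilon <= 1.0:
        raise ValueError("epsilon must lie in [0, 1]")
    off_value = epsilon / num_classes
    return [(1.0 - epsilon) + off_value if k == target_index else off_value
            for k in range(num_classes)]


def cross_entropy(target_dist, predicted_probs):
    """Cross-entropy between a target distribution and predicted probabilities."""
    return -sum(t * math.log(p)
                for t, p in zip(target_dist, predicted_probs) if t > 0.0)


# Hypothetical four-class setup: angry, happy, neutral, sad.
target = smooth_labels(target_index=1, num_classes=4, epsilon=0.1)
loss = cross_entropy(target, [0.1, 0.7, 0.1, 0.1])
print(target, loss)
```

With smoothing, a confident prediction on a possibly mislabeled utterance is penalized less harshly than with a one-hot target, since some target mass already sits on the other classes.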

poster_emotion_recognition_ssl_attentive_corr_icassp_2023.pdf

poster_emotion_recognition_ssl_attentive_corr_icassp_2023.pdf (59)

Categories: Speech Analysis (SPE-ANLS)

38 Views

LA-VocE: Low-SNR Audio-visual Speech Enhancement using Neural Vocoders

Audio-visual speech enhancement aims to extract clean speech from a noisy environment by leveraging not only the audio itself but also the target speaker's lip movements. This approach has been shown to yield improvements over audio-only speech enhancement, particularly for the removal of interfering speech. Despite recent advances in speech synthesis, most audio-visual approaches continue to use spectral mapping/masking to reproduce the clean audio, often resulting in visual backbones added to existing speech enhancement architectures.

paper.pdf

Link to paper (60)

poster.pdf

Link to poster (52)

slides.pdf

Link to slides (63)

Categories: Speech Enhancement (SPE-ENHA)

33 Views

Residual Squeeze-and-Excitation U-shaped Network for Minutia Extraction in Contactless Fingerprint Images

Slides.pdf

Slides.pdf (514)

Categories: Other

25 Views

Fast Robust Principal Component Analysis using Gauss-Newton Iterations

Robust Principal Component Analysis (RPCA) is an optimization problem that decomposes a data matrix into a low-rank and a sparse matrix. However, solving this problem using alternating procedures requires sequentially computing singular value decompositions (SVDs) of large matrices, which is computationally expensive. In this work, we propose a computation protocol that leverages Gauss-Newton iterations to speed up the sequential computation of SVDs and accelerate the entire RPCA process.
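
For reference, the low-rank plus sparse decomposition described above is often posed as the convex program below, known as principal component pursuit. This is the standard textbook formulation and not necessarily the exact objective used in this paper; the symbols M (data matrix), L (low-rank part), S (sparse part), and λ (trade-off weight) are notation assumed here.

```latex
\min_{L,\,S} \; \|L\|_{*} + \lambda \, \|S\|_{1}
\quad \text{subject to} \quad M = L + S
```

Here the nuclear norm of L is the sum of its singular values and the l1 norm of S is the sum of the absolute values of its entries. Alternating solvers for this problem typically update L by thresholding singular values, which is where the repeated SVDs mentioned in the abstract arise.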

Poster_Accelerated_RPCA_with_Gauss_Newton_ICASSP_2023v2.pdf

Poster_Accelerated_RPCA_with_Gauss_Newton_ICASSP_2023v2.pdf (86)

Categories: Sampling and Reconstruction

18 Views

Network Topology Inference from Gaussian and Stationary Graph Signals

Graphs have become pervasive tools to represent information and datasets with irregular support. However, in many cases, the underlying graph is either unavailable or naively obtained, calling for more advanced methods for its estimation. Indeed, graph topology inference methods that estimate the network structure from a set of signal observations have a long and well-established history. By assuming that the observations are both Gaussian and stationary in the sought graph, this paper proposes a new scheme to learn the network from nodal observations.
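
A common definition in the graph signal processing literature says that a random signal is stationary on a graph when its covariance matrix C can be written as a polynomial in the graph shift operator S, which implies that C and S commute. The sketch below checks that property on a three-node path graph. The helper names, the example graph, and the polynomial coefficients are assumptions made for illustration and are not taken from the paper.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def polynomial_in(shift, coeffs):
    """Return coeffs[0]*I + coeffs[1]*S + coeffs[2]*S^2 + ..."""
    n = len(shift)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total = [[0.0] * n for _ in range(n)]
    for c in coeffs:
        total = [[total[i][j] + c * power[i][j] for j in range(n)]
                 for i in range(n)]
        power = matmul(power, shift)
    return total


def max_abs_commutator(c, s):
    """Largest absolute entry of C*S - S*C; zero when C and S commute."""
    cs, sc = matmul(c, s), matmul(s, c)
    return max(abs(cs[i][j] - sc[i][j])
               for i in range(len(c)) for j in range(len(c)))


# Adjacency matrix of a path graph on three nodes, used as the shift operator.
S = [[0.0, 1.0, 0.0],
     [1.0, 0.0, 1.0],
     [0.0, 1.0, 0.0]]

# A covariance that is a polynomial in S, so it is stationary on this graph.
C_stationary = polynomial_in(S, [2.0, 0.5, 0.25])

# A diagonal covariance with unequal variances does not commute with S.
C_other = [[1.0, 0.0, 0.0],
           [0.0, 2.0, 0.0],
           [0.0, 0.0, 3.0]]

print(max_abs_commutator(C_stationary, S))
print(max_abs_commutator(C_other, S))
```

Topology inference methods built on stationarity use this relationship in reverse: given an estimate of C from observed signals, they search for a sparse S that (approximately) commutes with it.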

Slides_Gauss_St_inf_video.pdf

Slides_ICASSP2023 (89)

ICASSP2023_Poster_Andrei.pdf

Poster ICASSP2023 (80)

Categories: Statistical Signal Processing; Other

44 Views

Untrained graph neural networks for denoising

A fundamental problem in signal processing is to denoise a signal. While there are many well-performing methods for denoising signals defined on regular domains, including images defined on a two-dimensional pixel grid, many important classes of signals are defined over irregular domains that can be conveniently represented by a graph. This paper introduces two untrained graph neural network architectures for graph signal denoising, develops theoretical guarantees for their denoising capabilities in a simple setup, and provides empirical evidence in more general scenarios.

ICASSP23_Untrained_GNN_Signal_Denoising_poster.pdf

Poster that will be presented at ICASSP 2023 (95)

ICASSP23-Untrained_GNN_Signal_Denoising_slides.pdf

Slides of the video introducing the work (115)

Categories: Neural network learning (MLR-NNLR)

49 Views

Feature Selection and Text Embedding For Detecting Dementia from Spontaneous Cantonese

Dementia is a severe cognitive impairment that affects the health of older adults and creates a burden on their families and caretakers. This paper analyzes diverse hand-crafted features extracted from spoken languages and selects the most discriminative ones for dementia detection. Recently, the performance of dementia detection has been significantly improved by utilizing Transformer-based models that automatically capture the structural and linguistic properties of spoken languages. We investigate Transformer-based features and propose an end-to-end system for dementia detection.

ICASSP2023.pdf

ICASSP2023.pdf (92)

Categories: Spoken Language Processing

26 Views
