IEEE ICASSP 2023 - IEEE International Conference on Acoustics, Speech and Signal Processing is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The ICASSP 2023 conference will feature world-class presentations by internationally renowned speakers and cutting-edge session topics, and will provide an opportunity to network with like-minded professionals from around the world.

Image retrieval has attracted growing interest in recent times. Current approaches are either supervised or self-supervised, and so do not exploit the benefits of hybrid learning that combines supervision with self-supervision. We present a novel Master Assistant Buddy Network (MABNet) for image retrieval which incorporates both learning mechanisms. MABNet consists of a master block and an assistant block, which learn independently through supervision and collectively via self-supervision.

Diffusion kernel algorithms are interesting tools for distributed nonlinear estimation. However, for the sake of feasibility, it is essential in practice to restrict their computational cost and the number of communications. In this paper, we propose a censoring algorithm for adaptive kernel diffusion networks based on random Fourier features that locally adapts the number of nodes censored according to the estimation error.
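As background for the abstract above, the following is a minimal sketch of how random Fourier features approximate a Gaussian kernel with a finite-dimensional feature map, which is what makes kernel methods tractable on networks with limited communication. It is not the authors' censoring algorithm; the function names, the bandwidth, and the number of features are hypothetical choices made for illustration.

```python
import math
import random


def make_rff_map(input_dim, num_features, bandwidth, seed=0):
    """Build a random feature map z(x) with z(x).z(y) ~ Gaussian kernel k(x, y).

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = random.Random(seed)
    # Frequencies are drawn from N(0, 1/bandwidth^2), the spectral density
    # of the Gaussian kernel exp(-||x - y||^2 / (2 * bandwidth^2)).
    weights = [[rng.gauss(0.0, 1.0 / bandwidth) for _ in range(input_dim)]
               for _ in range(num_features)]
    # Random phases drawn uniformly from [0, 2*pi].
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    scale = math.sqrt(2.0 / num_features)

    def feature_map(x):
        return [scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
                for w, b in zip(weights, phases)]

    return feature_map


def gaussian_kernel(x, y, bandwidth):
    """Exact Gaussian kernel, used only to check the approximation."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


z = make_rff_map(input_dim=3, num_features=5000, bandwidth=1.0)
x, y = [0.2, -0.5, 1.0], [0.0, 0.3, 0.7]
approx = sum(a * b for a, b in zip(z(x), z(y)))  # inner product of features
exact = gaussian_kernel(x, y, 1.0)
```

With a feature map of this kind, each node can run a linear adaptive filter on `z(x)` and exchange a fixed-length weight vector with its neighbours, instead of a dictionary of kernel functions that grows with the data.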

Dynamic imaging of source and functional connectivity (FC) using electroencephalographic (EEG) signals is essential for understanding the brain and cognition with technology affordable enough to be widely used for studying changes associated with healthy ageing and the progression of neuropathology. We present an application for group analysis of recently developed state-space models and algorithms for simultaneously estimating the large-scale EEG inverse and FC problems.

When recognizing emotions from speech, we encounter two common problems: how to optimally capture emotion-relevant information from the speech signal, and how to best quantify or categorize the noisy, subjective emotion labels. Self-supervised pre-trained representations can robustly capture information from speech, enabling state-of-the-art results in many downstream tasks, including emotion recognition. However, better ways of aggregating the information across time need to be considered, as the relevant emotion information is likely to appear piecewise rather than uniformly across the signal.

Audio-visual speech enhancement aims to extract clean speech from a noisy environment by leveraging not only the audio itself but also the target speaker's lip movements. This approach has been shown to yield improvements over audio-only speech enhancement, particularly for the removal of interfering speech. Despite recent advances in speech synthesis, most audio-visual approaches continue to use spectral mapping/masking to reproduce the clean audio, often resulting in visual backbones added to existing speech enhancement architectures.

Robust Principal Component Analysis (RPCA) is an optimization problem that decomposes a data matrix into a low-rank and a sparse matrix. However, solving this problem using alternating procedures requires sequentially computing singular value decompositions (SVDs) of large matrices, which is computationally expensive. In this work, we propose a computation protocol that leverages Gauss-Newton iterations to speed up the sequential computation of SVDs and accelerate the entire RPCA process.
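For reference, RPCA is commonly posed as the convex program known as principal component pursuit. The abstract does not state which formulation the authors use, so the symbols below are assumptions: M is the data matrix, L the low-rank component, S the sparse component, and λ > 0 a trade-off weight.

```latex
\min_{L,\,S} \; \|L\|_{*} + \lambda \, \|S\|_{1}
\quad \text{subject to} \quad M = L + S
```

Here \|L\|_{*} is the nuclear norm (the sum of the singular values of L) and \|S\|_{1} is the entrywise l1 norm. In alternating schemes, the update of L typically applies soft-thresholding to the singular values of a matrix, so an SVD is needed at every iteration; this is the cost the abstract refers to.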

Graphs have become pervasive tools to represent information and datasets with irregular support. However, in many cases, the underlying graph is either unavailable or naively obtained, calling for more advanced methods for its estimation. Indeed, graph topology inference methods that estimate the network structure from a set of signal observations have a long and well-established history. By assuming that the observations are both Gaussian and stationary in the sought graph, this paper proposes a new scheme to learn the network from nodal observations.

A fundamental problem in signal processing is to denoise a signal. While there are many well-performing methods for denoising signals defined on regular domains, including images defined on a two-dimensional pixel grid, many important classes of signals are defined over irregular domains that can be conveniently represented by a graph. This paper introduces two untrained graph neural network architectures for graph signal denoising, develops theoretical guarantees for their denoising capabilities in a simple setup, and provides empirical evidence in more general scenarios.

Dementia is a severe cognitive impairment that affects the health of older adults and creates a burden on their families and caretakers. This paper analyzes diverse hand-crafted features extracted from spoken languages and selects the most discriminative ones for dementia detection. Recently, the performance of dementia detection has been significantly improved by utilizing Transformer-based models that automatically capture the structural and linguistic properties of spoken languages. We investigate Transformer-based features and propose an end-to-end system for dementia detection.
