IEEE ICASSP 2024 - IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The IEEE ICASSP 2024 conference will feature world-class presentations by internationally renowned speakers and cutting-edge session topics, and will provide a fantastic opportunity to network with like-minded professionals from around the world.

We consider the problem of learning smooth multivariate probability density functions. We invoke the canonical decomposition of multivariate functions and we show that if a joint probability density function admits a truncated Fourier series representation, then the classical univariate Fejér-Riesz Representation Theorem can be used for learning bona fide joint probability density functions. We propose a scalable, flexible, and direct framework for learning smooth multivariate probability density functions even from potentially incomplete datasets.
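As a minimal one-dimensional sketch of the underlying idea (not the authors' multivariate method), the Fejér-Riesz theorem implies that a nonnegative trigonometric polynomial can be written as the squared magnitude of a polynomial in exp(j2πx). Choosing the coefficients with unit Euclidean norm then gives a function that is nonnegative and integrates to 1 over one period, that is, a valid density on [0, 1]. The function names and coefficient values below are hypothetical and chosen only for illustration.

```python
import cmath
import math


def normalize(coeffs):
    """Scale coefficients to unit Euclidean norm so the density integrates to 1 on [0, 1]."""
    norm = math.sqrt(sum(abs(c) ** 2 for c in coeffs))
    return [c / norm for c in coeffs]


def density(x, coeffs):
    """Evaluate p(x) = |sum_k c_k exp(j*2*pi*k*x)|^2, nonnegative by construction."""
    s = sum(c * cmath.exp(2j * math.pi * k * x) for k, c in enumerate(coeffs))
    return abs(s) ** 2


def integrate(coeffs, n=1000):
    """Approximate the integral of p over one period with a uniform Riemann sum."""
    return sum(density(i / n, coeffs) for i in range(n)) / n


# Hypothetical coefficients, chosen only for illustration.
c = normalize([1.0, 0.5 - 0.2j, 0.1j])
```

Because the integral of p over [0, 1] equals the sum of the squared coefficient magnitudes, normalizing the coefficients is enough to enforce the unit-mass constraint without any numerical integration during learning.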

Continual learning, which aims to incrementally accumulate knowledge, has been an increasingly significant but challenging research topic for deep models that are prone to catastrophic forgetting. In this paper, we propose a novel replay-based continual learning approach in the context of class-incremental learning in acoustic scene classification, to classify audio recordings into an expanding set of classes that characterize the acoustic scenes. Our approach improves both the modeling and the memory selection mechanism via mutual information optimization in continual learning.

conferencing applications. We introduced a novel neural codec for low-bitrate speech coding at 6 kbit/s, with long 1 kbit/s redundancy, that also enhances speech by suppressing noise and reverberation. Transmitting large amounts of redundant information allows for speech reconstruction on the receiver side during severe packet loss – see ICASSP paper ID 7175: “Ultra low bitrate loss resilient neural speech enhancing codec”.

It is crucial to promptly diagnose potential Parkinson's disease (PD) patients in order to facilitate early treatment and prevent disease progression. In recent years, there has been growing interest in using facial expressions for in-vitro PD diagnosis due to the distinct "masked face" characteristics of PD patients and the cost-effectiveness of this approach. However, current facial expression-based PD diagnosis methods are hindered by limited training data on PD patients' facial expressions and weak prediction models.

In this paper, we investigate cross-lingual learning (CLL) for multilingual scene text recognition (STR). CLL transfers knowledge from one language to another. We aim to find the conditions under which knowledge from high-resource languages improves performance in low-resource languages. To do so, we first examine whether two general insights about CLL discussed in previous works apply to multilingual STR: (1) joint learning with high- and low-resource languages may reduce performance on low-resource languages, and (2) CLL works best between typologically similar languages.

Exercise-induced fatigue resulting from physical activity can be an early indicator of overtraining, illness, or other health issues. In this article, we present an automated method for estimating exercise-induced fatigue levels through the use of thermal imaging and facial analysis techniques utilizing deep learning models. Leveraging a novel dataset comprising over 400,000 thermal facial images of rested and fatigued users, our results suggest that exercise-induced fatigue levels could

Space debris detection and tracking, a key enabler for Space Situational Awareness (SSA), poses two inherent challenges: (1) the small size of the targets (e.g., 1-10 cm), which makes detection difficult for conventional ground-based radars (GBRs) and optical measurements; and (2) their large number, which makes tracking costly. To address these, this work uses the intersatellite links (ISLs) in emerging low Earth orbit (LEO) constellations to opportunistically sense debris. The spatially dense debris is modeled as a cluster to reduce the number of quantities to be estimated.

Recently, deep learning methods have shown promising results in point cloud compression. However, previous octree-based approaches either lack sufficient context or have high decoding complexity (e.g., > 900 s). To address this problem, we propose a sufficient yet efficient context model and design an efficient deep learning codec for point clouds. Specifically, we first propose a segment-constrained multi-group coding strategy to exploit the autoregressive context while maintaining decoding efficiency.
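For background on what an octree-based codec has to encode, the sketch below computes the 8-bit occupancy code of a single octree node, one bit per child octant. This only illustrates the general octree representation, not the context model or the multi-group coding strategy proposed in the paper. The function name, the bit ordering, and the sample points are assumptions made for this example.

```python
def occupancy_code(points, origin, size):
    """Return the 8-bit occupancy code of one cubic octree node.

    Bit i is set when child octant i contains at least one point.
    The bit order (x is the most significant axis) is an assumption of this sketch.
    """
    half = size / 2.0
    code = 0
    for x, y, z in points:
        # Each axis contributes one bit: 0 for the lower half, 1 for the upper half.
        ix = int(x - origin[0] >= half)
        iy = int(y - origin[1] >= half)
        iz = int(z - origin[2] >= half)
        code |= 1 << (ix * 4 + iy * 2 + iz)
    return code


# Hypothetical points inside the unit cube, chosen only for illustration.
pts = [(0.1, 0.1, 0.1), (0.9, 0.9, 0.9), (0.2, 0.8, 0.1)]
code = occupancy_code(pts, origin=(0.0, 0.0, 0.0), size=1.0)
```

An entropy coder then compresses the sequence of such codes, and the quality of the context used to predict each code largely determines the bitrate.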

The development of gene sequencing technology has sparked explosive growth in gene data, making the storage of gene data an important issue. Recently, researchers have begun to investigate deep learning-based gene data compression, which outperforms general traditional methods. In this paper, we propose a transformer-based gene compression method named GeneFormer. Specifically, we first introduce a modified transformer encoder with a latent array to eliminate the dependency of the nucleotide sequence.

Modern social media platforms play an important role in facilitating rapid dissemination of information through their massive user networks. Fake news, misinformation, and unverifiable facts on social media platforms propagate disharmony and affect society. In this paper, we consider the problem of misinformation detection, which classifies news items as fake or real. Specifically, driven by experiential studies on real-world social media platforms, we propose a probabilistic Markovian information spread model over networks modeled by graphs.
