ICASSP 2023

IEEE ICASSP 2023, the IEEE International Conference on Acoustics, Speech and Signal Processing, is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The ICASSP 2023 conference will feature world-class presentations by internationally renowned speakers and cutting-edge session topics, and will provide a fantastic opportunity to network with like-minded professionals from around the world.

Distributed Bayesian Tracking on the Special Euclidean Group using Lie Algebra Parametric Approximations

This paper proposes new distributed particle filters for tracking the state of a dynamic system that evolves on the Special Euclidean Group. The algorithms are based on the Random Exchange diffusion technique and build compressed parametric approximations to the particles using Lie algebras. Via numerical simulations, we observe that the proposed methods perform similarly to a centralized particle filter, surpassing an extended Kalman filter by a large margin.

icassp23_poster.pdf (179)

Categories: Statistical Signal Processing

18 Views

Diffusion Particle Filtering on the Special Orthogonal Group Using Lie Algebra Statistics

In this paper, we introduce new distributed diffusion algorithms to track a sequence of hidden random matrices that evolve on the special orthogonal group. The algorithms are based on the Adapt-then-Combine and the Random Exchange methods, and diffuse Gaussian approximations of posterior densities computed in the Lie algebra of the special orthogonal group. Simulation results show that, in scenarios with nonlinear observation functions, the proposed algorithms perform close to the centralized particle filter estimator and can outperform competing Extended Kalman Filters.
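To illustrate what a Gaussian approximation computed in the Lie algebra can look like, here is a minimal sketch for SO(3) in plain Python. It is not the authors' algorithm: the function names (`so3_exp`, `so3_log`, `fit_lie_gaussian`), the use of a caller-supplied reference rotation, and the restriction to three dimensions are all assumptions made for illustration. Weighted rotation particles are mapped to tangent vectors around the reference, and a weighted mean and covariance are computed there.

```python
import math

IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

def hat(v):
    """Map a 3-vector to the corresponding skew-symmetric matrix in so(3)."""
    x, y, z = v
    return [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [list(row) for row in zip(*A)]

def so3_exp(v):
    """Exponential map via the Rodrigues formula."""
    theta = math.sqrt(sum(c * c for c in v))
    K = hat(v)
    K2 = matmul(K, K)
    if theta < 1e-8:               # small-angle limits of the coefficients
        a, b = 1.0, 0.5
    else:
        a = math.sin(theta) / theta
        b = (1.0 - math.cos(theta)) / (theta * theta)
    return [[IDENTITY[i][j] + a * K[i][j] + b * K2[i][j] for j in range(3)]
            for i in range(3)]

def so3_log(R):
    """Logarithm map; numerically reliable only for rotation angles below pi."""
    trace = R[0][0] + R[1][1] + R[2][2]
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    theta = math.acos(cos_theta)
    w = [R[2][1] - R[1][2], R[0][2] - R[2][0], R[1][0] - R[0][1]]
    scale = 0.5 if theta < 1e-8 else theta / (2.0 * math.sin(theta))
    return [scale * c for c in w]

def fit_lie_gaussian(rotations, weights, R_ref):
    """Weighted mean and covariance of log(R_ref^T R_i) in the Lie algebra.

    R_ref is a hypothetical reference rotation chosen by the caller
    (for example, the highest-weight particle).
    """
    R_ref_T = transpose(R_ref)
    vs = [so3_log(matmul(R_ref_T, R)) for R in rotations]
    total = sum(weights)
    mean = [sum(w * v[i] for w, v in zip(weights, vs)) / total
            for i in range(3)]
    cov = [[sum(w * (v[i] - mean[i]) * (v[j] - mean[j])
                for w, v in zip(weights, vs)) / total
            for j in range(3)] for i in range(3)]
    return mean, cov
```

Under these assumptions, a node would only need to exchange the three-element mean and the symmetric 3x3 covariance rather than the full particle set; a rotation estimate can be recovered as `matmul(R_ref, so3_exp(mean))`.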

spl_poster.pdf (166)

Categories: Statistical Signal Processing

24 Views

Cross-Lingual Transfer Learning for Alzheimer’s Detection from Spontaneous Speech

Alzheimer’s disease (AD) is a progressive neurodegenerative disease most often associated with memory deficits and cognitive decline. With the aging population, there has been much interest in automated methods for cognitive impairment detection. One approach that has attracted attention in recent years is AD detection through spontaneous speech. While the results are promising, it is not certain whether the learned speech features can be generalized across languages. To fill this gap, the ADReSS-M challenge was organized.

ICASSP_2023_ADReSS-M_TAMM_v1.pptx

Presentation: Submission for Multilingual Alzheimer's Detection (ADReSS-M) Challenge -- 2nd place (136)

2023048850.pdf

Paper: Submission for Multilingual Alzheimer's Detection (ADReSS-M) Challenge -- 2nd place (121)

Categories: Other

17 Views

Ensemble and Personalized Transformer Models for Subject Identification and Relapse Detection in E-Prevention Challenge

In this short paper, we present the devised solutions for the subject identification and relapse detection tasks, which are part of the e-Prevention Challenge hosted at the ICASSP 2023 conference [1] [2] [3]. We specifically design an ensemble scheme of six models - five transformer-based ones and a CNN model - for the identification of subjects from wearable devices, while a personalized - one for each subject - scheme is used for relapse detection in psychotic disorder.

6877_slides.pdf

Presentation Slides (132)

Categories: Biomedical signal processing

19 Views

CUSTOMIZED AUTOMATIC FACE BEAUTIFICATION

In the age of social media, posting attractive mugshots is commonplace, leading to an urgent need for automatic facial beautification techniques. To better meet the esthetic preferences of users, we devise a customized automatic face beautification task that can retouch the face adaptively to match the user-entered target score whilst preserving the ID information as much as possible. To accomplish this task, we propose a Human Esthetics Guided StyleGAN Inversion method to retouch each face in the embedding space using StyleGAN inversion.

video.pptx (135)

Categories: Image/Video Processing

24 Views

Rain Estimation Over a Region Using CycleGan

In the last couple of years, supervised machine learning (ML) methods have shown state-of-the-art results for near-ground rain estimation. Information is usually obtained from two kinds of sensors - rain gauges, which measure rain rate, and commercial microwave links (CMLs) which measure attenuation. These data sources are paired to create a dataset on which a model is trained.
The drawback of this training approach is that the datasets must be constructed with a CML-rain gauge pairing relation.

Rain Estimation Over a Region Using CycleGan.pdf

Rain Estimation Over a Region Using CycleGan Research Manuscript (145)

Sagi_6763F.pdf

Rain Estimation Over a Region Using CycleGan Poster (147)

Categories: Other applications of machine learning (MLR-APPL)

86 Views

Towards Realizing the Value of Labeled Target Samples: a Two-Stage Approach for Semi-Supervised Domain Adaptation

Semi-Supervised Domain Adaptation (SSDA) is a recently emerging research topic that extends the widely investigated Unsupervised Domain Adaptation (UDA) by additionally labeling a few target samples, i.e., the model is trained with labeled source samples, unlabeled target samples, and a few labeled target samples.

ssda_icassp2023.pdf (127)

Categories: Image/Video Processing

12 Views

ST360IQ: NO-REFERENCE OMNIDIRECTIONAL IMAGE QUALITY ASSESSMENT WITH SPHERICAL VISION TRANSFORMERS

Omnidirectional images, aka 360 images, can deliver immersive and interactive visual experiences. As their popularity has increased dramatically in recent years, evaluating the quality of 360 images has become a problem of interest since it provides insights for capturing, transmitting, and consuming this new media. However, directly adapting quality assessment methods proposed for standard natural images for omnidirectional data poses certain challenges. These models need to deal with very high-resolution data and implicit distortions due to the spherical form of the images.

ST360IQ_ NO-REFERENCE OMNIDIRECTIONAL IMAGE QUALITY ASSESSMENT WITH SPHERICAL VISION TRANSFORMERS.pdf (176)

Categories: Image/Video Processing

17 Views

Weighted Sampling For Masked Language Modeling

Masked Language Modeling (MLM) is widely used to pretrain language models. The standard random masking strategy in MLM causes the pre-trained language models (PLMs) to be biased towards high-frequency tokens. Representation learning of rare tokens is poor and PLMs have limited performance on downstream tasks. To alleviate this frequency bias issue, we propose two simple and effective Weighted Sampling strategies for masking tokens based on token frequency and training loss. We apply these two strategies to BERT and obtain Weighted-Sampled BERT (WSBERT).
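As a rough illustration of frequency-based weighted masking, the sketch below gives rarer tokens a higher chance of being selected for masking. It is not the WSBERT method: the weight formula (count raised to a negative power `alpha`), the function names, and the default values are all assumptions made for illustration, and the loss-based strategy mentioned in the abstract is not covered.

```python
import random
from collections import Counter

def frequency_weights(corpus_tokens, alpha=0.5):
    """Assign each token a sampling weight that decreases with its frequency.

    The form count ** -alpha is a hypothetical choice for this sketch.
    """
    counts = Counter(corpus_tokens)
    return {tok: count ** -alpha for tok, count in counts.items()}

def choose_mask_positions(tokens, weights, mask_ratio=0.15, rng=None):
    """Pick positions to mask by weighted sampling without replacement.

    Uses the Efraimidis-Spirakis method: each item draws the key
    u ** (1 / weight) with u uniform in [0, 1), and the k largest keys win.
    Tokens missing from `weights` get weight 1.0, i.e. they are treated
    like a token seen once.
    """
    rng = rng or random.Random()
    k = max(1, round(mask_ratio * len(tokens)))
    keyed = [(rng.random() ** (1.0 / weights.get(tok, 1.0)), i)
             for i, tok in enumerate(tokens)]
    keyed.sort(reverse=True)
    return sorted(i for _, i in keyed[:k])

def mask_tokens(tokens, positions, mask_token="[MASK]"):
    """Replace the tokens at the chosen positions with the mask symbol."""
    chosen = set(positions)
    return [mask_token if i in chosen else tok
            for i, tok in enumerate(tokens)]
```

With uniform random masking, every position in a sentence is equally likely to be masked regardless of how common its token is; under the assumed weighting, a rare token is selected more often than a frequent one, which is the bias correction the abstract describes in general terms.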

ICASSP2023-WeightSampling-short-for-video.pdf (209)

Categories: Spoken Language Processing

32 Views

A Database for Multi-Modal Short Video Quality Assessment

Short videos have gained increasing attention in information sharing and commercial promotion due to the fast development of social platforms. This growth creates a strong need to assess the quality of short videos for efficient information acquisition and propagation. However, existing video quality assessment research focuses on assessing video content with five rating scores, limiting the assessment to a one-dimensional, simplified criterion.

6251_.pdf

Presentation slides for Paper#6251 "A Database for Multi-Modal Short Video Quality Assessment" (192)

Categories: Quality Assessment

36 Views
