IEEE ICASSP 2024

The IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. IEEE ICASSP 2024 will feature world-class presentations by internationally renowned speakers and cutting-edge session topics, and will provide an opportunity to network with like-minded professionals from around the world.

Semi-Supervised Domain Adaptation for EEG-Based Sleep Stage Classification

Electroencephalogram (EEG) based sleep stage classification is very important in sleep quality analysis and the treatment of sleep disorders. Deep learning based automated sleep staging has achieved promising performance. However, it has not been widely adopted in clinical practice, due to the domain shift problem and insufficient labeled training data, especially for patients. To cope with these problems, this paper proposes a Transformer-based semi-supervised domain adaptation (SSDA) approach for EEG-based sleep stage classification.

海报.pdf (129)

Categories: Biomedical signal processing

38 Views

TF-SepNet: An Efficient 1D Kernel Design in CNNs for Low-Complexity Acoustic Scene Classification

Recent studies focus on developing efficient systems for acoustic scene classification (ASC) using convolutional neural networks (CNNs), which typically consist of consecutive kernels. This paper highlights the benefits of using separate kernels as a more powerful and efficient design approach in ASC tasks. Inspired by the time-frequency nature of audio signals, we propose TF-SepNet, a CNN architecture that separates the feature processing along the time and frequency dimensions. Features resulting from the separate paths are then merged by channels and directly forwarded to the classifier.
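To make the idea of separate time and frequency paths concrete, here is a minimal pure-Python sketch. It is not the TF-SepNet architecture: the function names, the zero padding, and the single fixed kernel per path are all assumptions made for illustration. It only shows one 1D kernel sliding along time, another along frequency, and the two results being stacked as channels.

```python
def conv_time(spec, kernel):
    """Slide a 1D kernel along the time axis of spec[freq][time].

    Uses zero padding so the output keeps the input shape, and applies the
    kernel without flipping (cross-correlation, as CNN layers do).
    """
    half = len(kernel) // 2
    n_freq, n_time = len(spec), len(spec[0])
    out = [[0.0] * n_time for _ in range(n_freq)]
    for f in range(n_freq):
        for t in range(n_time):
            for k, w in enumerate(kernel):
                tt = t + k - half
                if 0 <= tt < n_time:
                    out[f][t] += w * spec[f][tt]
    return out


def conv_freq(spec, kernel):
    """Slide a 1D kernel along the frequency axis by transposing twice."""
    transposed = [list(col) for col in zip(*spec)]
    filtered = conv_time(transposed, kernel)
    return [list(row) for row in zip(*filtered)]


def tf_separate_features(spec, time_kernel, freq_kernel):
    """Run the two 1D paths independently and stack them as two channels."""
    return [conv_time(spec, time_kernel), conv_freq(spec, freq_kernel)]


# Hypothetical 2 x 3 "spectrogram" (2 frequency bins, 3 time frames).
spec = [[1, 2, 3], [4, 5, 6]]
channels = tf_separate_features(spec, [1, 0, -1], [1, 0, -1])
print(channels[0])  # time path
print(channels[1])  # frequency path
```

In this sketch each path uses 3 weights, so the pair uses 6, whereas a single 3 x 3 kernel would use 9. The weight counts of the actual model depend on details not given in the abstract.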

ICASSP2024_Poster_1.pdf (170)

Categories: Audio Processing Systems

23 Views

EEG-BASED FAST AUDITORY ATTENTION DETECTION IN REAL-LIFE SCENARIOS USING TIME-FREQUENCY ATTENTION MECHANISM

Auditory attention detection (AAD) based on electroencephalogram (EEG) helps recognize the target speaker in a cocktail party scenario, advancing auditory brain-computer interface development. Previous EEG studies on AAD were largely based on data collected in laboratory settings. In this study, we investigated AAD using EEG data collected while subjects were walking and sitting in real-life scenarios. To improve the detection accuracy, we propose adding a time-frequency attention mechanism to a convolutional neural network applied to the EEG data.

icassp海报5.pdf (178)

Categories: Biomedical signal processing

56 Views

STABLE OPTIMIZATION FOR LARGE VISION MODEL BASED DEEP IMAGE PRIOR IN CONE-BEAM CT RECONSTRUCTION

Large Vision Models (LVMs) have recently demonstrated great potential for medical imaging tasks, potentially enabling image enhancement for sparse-view Cone-Beam Computed Tomography (CBCT), despite requiring a substantial amount of data for training. Meanwhile, Deep Image Prior (DIP) effectively guides an untrained neural network to generate high-quality CBCT images without any training data. However, the original DIP method relies on a well-defined forward model and a large-capacity backbone network, which is notoriously difficult to converge.

poster.pdf

poster BISP-P7.2 (208)

Categories: Medical imaging
Medical image analysis

38 Views

GLAND SEGMENTATION VIA DUAL ENCODERS AND BOUNDARY-ENHANCED ATTENTION

Accurate and automated gland segmentation on pathological images can assist pathologists in diagnosing the malignancy of colorectal adenocarcinoma. However, due to various gland shapes, severe deformation of malignant glands, and overlapping adhesions between glands, gland segmentation has always been very challenging. To address these problems, we propose a DEA model. This model consists of two branches: the backbone encoding and decoding network and the local semantic extraction network.

ICASSP_Lecture_4_09.pptx (125)

Categories: Other

51 Views

Detection and Attribution of Models Trained on Generated Data

Generative Adversarial Networks (GANs) have become widely used in model training, as they can improve performance and/or protect sensitive information by generating data. However, this also raises potential risks, as malicious GANs may compromise or sabotage models by poisoning their training data. Therefore, it is important to verify the origin of a model’s training data for accountability purposes. In this work, we take the first step in the forensic analysis of models trained on GAN-generated data. Specifically, we first detect whether a model is trained on GAN-generated or real data.

ICASSP_poster.pdf

Poster Presentation (231)

Categories: Information Forensics and Security

21 Views

MUSIC AUTO-TAGGING WITH ROBUST MUSIC REPRESENTATION LEARNED VIA DOMAIN ADVERSARIAL TRAINING

Music auto-tagging is crucial for enhancing music discovery and recommendation. Existing models in Music Information Retrieval (MIR) struggle with real-world noise such as environmental and speech sounds in multimedia content. This study proposes a method inspired by speech-related tasks to enhance music auto-tagging performance in noisy settings. The approach integrates Domain Adversarial Training (DAT) into the music domain, enabling robust music representations that withstand noise.

ICASSP2024.pdf (194)

Categories: Music Signal Processing

33 Views

VOCAL FOLD DYNAMICS FOR AUTOMATIC DETECTION OF AMYOTROPHIC LATERAL SCLEROSIS FROM VOICE

Amyotrophic Lateral Sclerosis (ALS) is a neurodegenerative disease that affects motor neurons and causes speech and respiratory dysfunctions. Current diagnostic methods are complicated, thus motivating the development of an efficient and objective diagnostic aid. We hypothesize that analyses of features capturing the essential characteristics of the biomechanical process of voice production can distinguish ALS patients from non-ALS controls. In this paper, we represent voices with algorithmically estimated vocal fold dynamics from physical models of phonation.

icassp_poster.pdf

Poster for paper (317)

Categories: Bioacoustics and Medical Acoustics

19 Views

Poster for "Graph-based permutation patterns for the analysis of task-related fMRI signals on DTI networks in mild cognitive impairment"

Permutation Entropy (PE) is a powerful nonlinear analysis technique for univariate time series. Recently, Permutation Entropy for Graph signals (PE_G) has been proposed to extend PE to data residing on irregular domains. However, PE_G is limited as it provides a single value to characterise a whole graph signal. Here, we introduce a novel approach to evaluate graph signals at the vertex level: graph-based permutation patterns. Synthetic datasets show the efficacy of our method.
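For reference, the classic univariate PE mentioned above can be sketched in a few lines of Python. This is only the standard time-series definition, not PE_G and not the vertex-level graph-based permutation patterns introduced in the paper. The function name and the `order`, `delay`, and `normalize` parameters are illustrative assumptions.

```python
import math
from collections import Counter


def permutation_entropy(series, order=3, delay=1, normalize=True):
    """Classic permutation entropy of a 1D sequence, in bits."""
    n_windows = len(series) - (order - 1) * delay
    if n_windows <= 0:
        raise ValueError("series is too short for the chosen order and delay")

    # Map each window to its ordinal pattern: the index order that sorts it.
    # Ties are broken by order of appearance because sorted() is stable.
    patterns = Counter()
    for start in range(n_windows):
        window = [series[start + k * delay] for k in range(order)]
        pattern = tuple(sorted(range(order), key=lambda k: window[k]))
        patterns[pattern] += 1

    # Shannon entropy of the empirical pattern distribution.
    entropy = 0.0
    for count in patterns.values():
        p = count / n_windows
        entropy -= p * math.log2(p)

    # Optionally rescale to [0, 1] by the maximum value log2(order!).
    if normalize:
        entropy /= math.log2(math.factorial(order))
    return entropy


print(permutation_entropy([4, 7, 9, 10, 6, 11, 3], order=2))
```

For the series 4, 7, 9, 10, 6, 11, 3 with `order=2`, four of the six adjacent pairs are ascending and two are descending, which gives roughly 0.918 bits. A strictly increasing series produces a single pattern and therefore an entropy of zero.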

Poster_ICASSP.pdf (402)

Categories: Biomedical signal processing

49 Views

ASSESSING VIBROACOUSTIC SOUND MASSAGE THROUGH THE BIOSIGNAL OF HUMAN SPEECH: EVIDENCE OF IMPROVED WELLBEING

Stress has notorious and debilitating effects on individuals and entire industries alike, with instances of stress continuing to rise post-pandemic. We investigate here (1) whether the new technology of Vibroacoustic Sound Massage (VSM) has beneficial effects on user wellbeing and (2) whether we can measure these effects based on the biosignal of speech prosody. Forty participants read a text before and after VSM treatment (45 min). The 80 readings were subjected to a multi-parametric acoustic-prosodic analysis.

ASSESSING VIBROACOUSTIC SOUND MASSAGE THROUGH THE BIOSIGNAL OF HUMAN SPEECH - EVIDENCE OF IMPROVED WELLBEING.pdf

ASSESSING VIBROACOUSTIC SOUND MASSAGE THROUGH THE BIOSIGNAL OF HUMAN SPEECH: EVIDENCE OF IMPROVED WELLBEING (152)

Categories: Speech Perception and Psychoacoustics (SPE-SPER)

29 Views
