IEEE ICASSP 2024 - The IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The IEEE ICASSP 2024 conference will feature world-class presentations by internationally renowned speakers and cutting-edge session topics, and will provide an opportunity to network with like-minded professionals from around the world.

Electroencephalogram (EEG) based sleep stage classification is very important in sleep quality analysis and the treatment of sleep disorders. Deep learning based automated sleep staging has achieved promising performance. However, it has not been widely adopted in clinical practice, due to the domain shift problem and insufficient labeled training data, especially for patients. To cope with these problems, this paper proposes a Transformer-based semi-supervised domain adaptation (SSDA) approach for EEG-based sleep stage classification.

Recent studies focus on developing efficient systems for acoustic scene classification (ASC) using convolutional neural networks (CNNs), which typically consist of consecutive kernels. This paper highlights the benefits of using separate kernels as a more powerful and efficient design approach in ASC tasks. Inspired by the time-frequency nature of audio signals, we propose TF-SepNet, a CNN architecture that separates the feature processing along the time and frequency dimensions. Features resulting from the separate paths are then merged by channels and directly forwarded to the classifier.
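
The idea of processing a feature map separately along frequency and time, then merging the two paths by channels, can be illustrated with a small NumPy sketch. This is not the TF-SepNet implementation: the function names (`conv_along_axis`, `tf_separate_block`), the even channel split, and the fixed 1D kernels shared across channels are all assumptions made for illustration. A real CNN would use learned, multi-channel kernels.

```python
import numpy as np


def conv_along_axis(x, kernel, axis):
    """Apply a 1D convolution with `kernel` along one axis of `x`.

    Note: np.convolve flips the kernel (true convolution); for the symmetric
    kernels used here this matches the cross-correlation used in CNNs.
    """
    return np.apply_along_axis(
        lambda v: np.convolve(v, kernel, mode="same"), axis, x
    )


def tf_separate_block(features, freq_kernel, time_kernel):
    """Hypothetical block: split channels, filter each half along one axis.

    features: array of shape (channels, freq_bins, time_frames).
    The first half of the channels is filtered along frequency (axis 1),
    the second half along time (axis 2). This split is an assumption.
    """
    half = features.shape[0] // 2
    freq_path = conv_along_axis(features[:half], freq_kernel, axis=1)
    time_path = conv_along_axis(features[half:], time_kernel, axis=2)
    # Merge the two paths by channels, as the abstract describes.
    return np.concatenate([freq_path, time_path], axis=0)


# Example: a 4-channel feature map with 8 frequency bins and 10 time frames.
rng = np.random.default_rng(0)
features = rng.random((4, 8, 10))
smoothing = np.array([0.25, 0.5, 0.25])  # illustrative fixed kernel
merged = tf_separate_block(features, smoothing, smoothing)
```

The merged output keeps the input shape, so in this sketch it could be forwarded to a classifier or stacked with further blocks.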

Auditory attention detection (AAD) based on electroencephalogram (EEG) helps recognize the target speaker in a cocktail party scenario, advancing auditory brain-computer interface development. Previous EEG studies on AAD were largely based on data collected in laboratory settings. In this study, we investigated AAD with EEG data collected while subjects were walking and sitting in real-life scenarios. To improve the detection accuracy, we added a time-frequency attention mechanism to a convolutional neural network operating on the EEG data.

Large Vision Model (LVM) has recently demonstrated great potential for medical imaging tasks, potentially enabling image enhancement for sparse-view Cone-Beam Computed Tomography (CBCT), despite requiring a substantial amount of data for training. Meanwhile, Deep Image Prior (DIP) effectively guides an untrained neural network to generate high-quality CBCT images without any training data. However, the original DIP method relies on a well-defined forward model and a large-capacity backbone network, which is notoriously difficult to converge.

Accurate and automated gland segmentation on pathological images can assist pathologists in diagnosing the malignancy of colorectal adenocarcinoma. However, due to various gland shapes, severe deformation of malignant glands, and overlapping adhesions between glands, gland segmentation has always been very challenging. To address these problems, we propose a DEA model. This model consists of two branches: the backbone encoding and decoding network and the local semantic extraction network.

Generative Adversarial Networks (GANs) have become widely used in model training, as they can improve performance and/or protect sensitive information by generating data. However, this also raises potential risks, as malicious GANs may compromise or sabotage models by poisoning their training data. Therefore, it is important to verify the origin of a model’s training data for accountability purposes. In this work, we take the first step in the forensic analysis of models trained on GAN-generated data. Specifically, we first detect whether a model is trained on GAN-generated or real data.

Music auto-tagging is crucial for enhancing music discovery and recommendation. Existing models in Music Information Retrieval (MIR) struggle with real-world noise such as environmental and speech sounds in multimedia content. This study proposes a method inspired by speech-related tasks to enhance music auto-tagging performance in noisy settings. The approach integrates Domain Adversarial Training (DAT) into the music domain, enabling robust music representations that withstand noise.

Amyotrophic Lateral Sclerosis (ALS) is a neurodegenerative disease that affects motor neurons and causes speech and respiratory dysfunctions. Current diagnostic methods are complicated, thus motivating the development of an efficient and objective diagnostic aid. We hypothesize that analyses of features capturing the essential characteristics of the biomechanical process of voice production can distinguish ALS patients from non-ALS controls. In this paper, we represent voices with algorithmically estimated vocal fold dynamics from physical models of phonation.

Permutation Entropy (PE) is a powerful nonlinear analysis technique for univariate time series. Recently, Permutation Entropy for Graph signals (PE_G) has been proposed to extend PE to data residing on irregular domains. However, PE_G is limited as it provides a single value to characterise a whole graph signal. Here, we introduce a novel approach to evaluate graph signals at the vertex level: graph-based permutation patterns. Synthetic datasets show the efficacy of our method.
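
For context, the classical univariate PE that this work builds on can be sketched in a few lines. The code below is a minimal illustration of standard permutation entropy for a time series only; it does not implement PE_G or the vertex-level graph-based permutation patterns proposed in the paper. The function name and the `order`, `delay`, and `normalize` parameters are assumptions chosen for this example.

```python
import math
from collections import Counter


def permutation_entropy(series, order=3, delay=1, normalize=True):
    """Classical permutation entropy of a univariate time series.

    order: embedding dimension (length of each ordinal pattern), >= 2.
    delay: spacing between the samples that form one pattern.
    normalize: if True, divide by log(order!) so the result lies in [0, 1].
    """
    if order < 2:
        raise ValueError("order must be at least 2")
    n_windows = len(series) - (order - 1) * delay
    if n_windows <= 0:
        raise ValueError("series too short for the chosen order and delay")

    patterns = Counter()
    for start in range(n_windows):
        window = [series[start + k * delay] for k in range(order)]
        # Ordinal pattern: the index order that sorts the window
        # (ties are broken by position because sorted() is stable).
        pattern = tuple(sorted(range(order), key=lambda k: window[k]))
        patterns[pattern] += 1

    # Shannon entropy (natural log) of the pattern distribution.
    entropy = -sum(
        (count / n_windows) * math.log(count / n_windows)
        for count in patterns.values()
    )
    if normalize:
        entropy /= math.log(math.factorial(order))
    return entropy


# Example: a short series with a mix of rising and falling pairs.
pe_value = permutation_entropy([4, 7, 9, 10, 6, 11, 3], order=2)
```

A strictly increasing series produces a single ordinal pattern and therefore zero entropy, while a series in which all patterns are equally frequent gives a normalized value of 1.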

Stress has notorious and debilitating effects on individuals and entire industries alike, with instances of stress continuing to rise post-pandemic. We investigate here (1) whether the new technology of Vibroacoustic Sound Massage (VSM) has beneficial effects on user wellbeing and (2) whether we can measure these effects based on the biosignal of speech prosody. Forty participants read a text before and after VSM treatment (45 min). The 80 readings were subjected to a multi-parametric acoustic-prosodic analysis.
