IEEE ICASSP 2024

IEEE ICASSP 2024 - IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The IEEE ICASSP 2024 conference will feature world-class presentations by internationally renowned speakers, cutting-edge session topics and provide a fantastic opportunity to network with like-minded professionals from around the world. Visit the website.

MAML-Based 24-Hour Personalized Blood Pressure Estimation from Wrist Photoplethysmography Signals in Free-Living Context

Systolic blood pressure (BP) variability at daytime and nighttime, also known as BP dip, has shown its clinical value in diagnosis and treatment of cardiovascular diseases (CVDs). A model agnostic meta learning (MAML)-based 24-hour personalized approach is proposed in this paper for BP estimation using wrist photoplethysmography (PPG) signals from smart watch in free-living context.

PID8482_Poster.pdf

PID8482_Poster_ICASSP2024 (196)

Categories:: Biomedical signal processing

53 Views

Exploiting spatial attention mechanism for improved depth completion and feature fusion in novel view synthesis

Many image-based rendering (IBR) methods rely on depth estimates obtained from structured light or time-of-flight depth sensors to synthesize novel views from sparse camera networks. However, these estimates often contain missing or noisy regions, resulting in an incorrect mapping between source and target views. This situation makes the fusion process more challenging, as the visual information is misaligned, inconsistent, or missing.

Ban pdf.pdf

Ban pdf.pdf (224)

Categories:: Machine Learning for Signal Processing

20 Views

A BI-PYRAMID MULTIMODAL FUSION METHOD FOR THE DIAGNOSIS OF BIPOLAR DISORDERS

Read more about A BI-PYRAMID MULTIMODAL FUSION METHOD FOR THE DIAGNOSIS OF BIPOLAR DISORDERS
Log in to post comments

Previous research on the diagnosis of Bipolar disorder has mainly focused on resting-state functional magnetic resonance imaging. However, their accuracy can not meet the requirements of clinical diagnosis. Efficient multimodal fusion strategies have great potential for applications in multimodal data and can further improve the performance of medical diagnosis models. In this work, we utilize both sMRI and fMRI data and propose a novel multimodal diagnosis model for bipolar disorder.

ICASSP2024_BD.pptx

ICASSP2024_BD.pptx (189)

Categories:: Multimedia Signal Processing

16 Views

A Flexible Framework for Expectation Maximization-Based MIMO System Identification for Time-Variant Linear Acoustic Systems

Quasi-continuous system identification of time-variant linear acoustic systems can be applied in various audio signal processing applications when numerous acoustic transfer functions must be measured. A prominent application is measuring head-related transfer functions. We treat the underlying multiple-input-multiple-output (MIMO) system identification problem in a state-space model as a joint estimation problem for states, representing impulse responses, and state-space model parameters using the expectation maximization (EM) algorithm.

poster_11602.pdf

poster_11602.pdf (198)

Categories:: Audio and Acoustic Signal Processing

31 Views

Paper

Read more about Paper
Log in to post comments

Clusters of neurons generate electrical signals which propagate in all directions through brain tissue, skull, and scalp of different conductivity. Measuring these signals with electroencephalography (EEG) sensors placed on the scalp results in noisy data. This can have severe impact on estimation, such as, source localization and temporal response functions (TRFs). We hypothesize that some of the noise is due to a Wiener-structured signal propagation with both linear and nonlinear components.

paper_JohannaWilroth.pdf

paper_JohannaWilroth.pdf (231)

Categories:: Auditory Modeling and Hearing Aids

27 Views

Poster: Reversible Jump Markov chain Monte Carlo for Pulse Fitting

Read more about Poster: Reversible Jump Markov chain Monte Carlo for Pulse Fitting
Log in to post comments

This paper proposes a reversible jump Markov chain Monte Carlo method that provides efficient inference for the general problem of pulse fitting. In particular, it minimises the potential of an adopted parametric model overfitting to the (noisy) data via the inclusion of a peak proximity parameter. This facilitates learning a more representative underlying model and significantly reduces the computational cost. Synthetic and real data are used to demonstrate the efficacy of the introduced Bayesian technique.

Contact: afg30@cam.ac.uk

Final Poster.pdf

Final Poster.pdf (247)

Categories:: Bayesian learning; Bayesian signal processing (MLR-BAYL)

33 Views

Contextual Human Object Interaction Understanding From Pre-Trained Large Language Model

Existing human object interaction (HOI) detection methods have introduced zero-shot learning techniques to recognize unseen interactions, but they still have limitations in understanding context information and comprehensive reasoning. To overcome these limitations, we propose a novel HOI learning framework, ContextHOI, which serves as an effective contextual HOI detector to enhance contextual understanding and zero-shot reasoning ability. The main contributions of the proposed ContextHOI are a novel context-mining decoder and a powerful interaction reasoning large language model (LLM).

ContextHOI_ICASSP2024.pptx

ContextHOI_ICASSP2024.pptx (160)

Categories:: Image/Video Processing

21 Views

Deep Manifold Transformation for Protein Representation Learning

Read more about Deep Manifold Transformation for Protein Representation Learning
Log in to post comments

Protein representation learning is critical in various tasks in biology, such as drug design and protein structure or function prediction, which has primarily benefited from protein language models and graph neural networks. These models can capture intrinsic patterns from protein sequences and structures through masking and task-related losses. However, the learned protein representations are usually not well optimized, leading to performance degradation due to limited data, difficulty adapting to new tasks, etc.

poster_ICASSP2024_hubozhen.pdf

Poster of the paper. (206)

Categories:: Other

56 Views

Adaptive Multi-Exposure Fusion for Enhanced Neural Radiance Fields

Read more about Adaptive Multi-Exposure Fusion for Enhanced Neural Radiance Fields
Log in to post comments

Neural Radiance Fields (NeRF) have revolutionized 3D scene modeling and rendering. However, their performance dips when handling images with diverse exposure levels, mainly due to the intricate luminance dynamics. Addressing this, we present an innovative method that proficiently models and renders images across a spectrum of exposure conditions. Our approach utilizes an unsupervised classifier-generator structure for HDR fusion, significantly enhancing NeRF's ability to comprehend and adjust to light variations, leading to the generation of images with appropriate brightness.

poster_icassp2024.pdf

poster_icassp2024.pdf (267)

Categories:: Image, Video, and Multidimensional Signal Processing

53 Views

PRE-TRAINED ACOUSTIC-AND-TEXTUAL MODELING FOR END-TO-END SPEECH-TO-TEXT TRANSLATION

Read more about PRE-TRAINED ACOUSTIC-AND-TEXTUAL MODELING FOR END-TO-END SPEECH-TO-TEXT TRANSLATION
Log in to post comments

End-to-end paradigm has aroused more and more interests and attention for improving speech-to-text translation (ST) recently. Existing end-to-end models mainly attributes and attempts to address the problem of modeling burden and data scarcity, while always fail to maintain both cross-modal and cross-lingual mapping well at the same time.

icassp2024-PATM-poster.pptx

Poster of Paper SLP-P23.3 in ICASSP2024 (180)

Categories:: Machine Translation of Speech (SLP-SSMT)

33 Views

IEEE ICASSP 2024

Pages