![](https://sigport.org/sites/default/files/styles/medium/public/icassp24_logo.jpg?itok=fMlObe3v)
IEEE ICASSP 2024 - IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The IEEE ICASSP 2024 conference will feature world-class presentations by internationally renowned speakers, cutting-edge session topics and provide a fantastic opportunity to network with like-minded professionals from around the world. Visit the website.
![](https://sigport.org/sites/default/files/styles/home/public/icassp24_logo_0.jpg?itok=OGpw2wC4)
- Read more about THE USTC SYSTEM FOR CADENZA 2024 CHALLENGE
- Log in to post comments
This paper reports our submission to the ICASSP 2024 Cadenza Challenge, focusing on the non-causal system. The challenge aims to develop a signal processing system that enables personalized rebalancing of music to improve the listening experience for individuals with hearing loss when they listen to music via their hearing aids. The system is based on the Hybrid Demucs model. We fine-tuned the baseline model on the given dataset with a multi-target strategy and added a mixture of information in the downmixing stage.
- Categories:
![](https://sigport.org/sites/default/files/styles/home/public/icassp24_logo_0.jpg?itok=OGpw2wC4)
- Read more about SCORE CALIBRATION BASED ON CONSISTENCY MEASURE FACTOR FOR SPEAKER VERIFICATION
- Log in to post comments
This paper proposes a new scoring calibration method named ``Consistency-Aware Score Calibration", which introduces a Consistency Measure Factor (CMF) to measure the stability of audio voiceprints in similarity scores for speaker verification. The CMF is inspired by the limitations in segment scoring, where the segments with shorter length are not friendly to calculate the similarity score.
- Categories:
![](https://sigport.org/sites/default/files/styles/list/public/ICASSP2024.png?itok=RxdqWY_y)
- Read more about Quantum Privacy Aggregation of Teacher Ensembles (QPATE) for Privacy-preserving Quantum Machine Learning
- Log in to post comments
The utility of machine learning has rapidly expanded in the last two decades and presented an ethical challenge. Papernot et. al. developed a technique, known as Private Aggregation of Teacher Ensembles (PATE) to enable federated learning in which multiple \emph{distributed teachers} are trained on disjoint data sets. This study is the first to apply PATE to an ensemble of quantum neural networks (QNN) to pave a new way of ensuring privacy in quantum machine learning (QML).
- Categories:
![](https://sigport.org/sites/default/files/styles/list/public/Fig3_0.png?itok=Q_qi6dA0)
- Read more about TCNAS: TRANSFORMER ARCHITECTURE EVOLVING IN CODE CLONE DETECTION
- Log in to post comments
Code clone detection aims at finding code fragments with syntactic or semantic similarity. Most of current approaches mainly focus on detecting syntactic similarity while ignoring semantic long-term context alignment, and these detection methods encode the source code using human-designed models, a process which requires both expert input and a significant cost of time for experimentation and refinement. To address these challenges, we introduce the Transformer Code Neural Architecture Search (TCNAS), an approach designed to optimize transformer-based architectures for detection.
- Categories:
![](https://sigport.org/sites/default/files/styles/home/public/icassp24_logo_0.jpg?itok=OGpw2wC4)
- Read more about [Slides] Multi-Sensor Multi-Scan Radar Sensing of Multiple Extended Targets
- Log in to post comments
We propose an efficient solution to the state estimation problem in multi-scan multi-sensor multiple extended target sensing scenarios. We first model the measurement process by a doubly inhomogeneous-generalized shot noise Cox process and then estimate the parameters using a jump Markov chain Monte Carlo sampling technique. The proposed approach scales linearly in the number of measurements and can take spatial properties of the sensors into account, herein, sensor noise covariance, detection probability, and resolution.
- Categories:
![](https://sigport.org/sites/default/files/styles/list/public/%E8%AF%AD%E4%B9%89%E6%B0%B4%E5%8D%B0%E6%A1%86%E5%9B%BE.png?itok=ZzBlhq8A)
- Read more about SEMANTIC SECURITY: A DIGITAL WATERMARK METHOD FOR IMAGE SEMANTIC PRESERVATION
- Log in to post comments
This is a poster on a proposed watermarking method. In the research, we first chose position vector domain instead of traditional spatial or frequency domain. In addition, we successfully implemented watermarking on semantic communication system. Third, we modeled watermarking channel so that researchers could systematically research watermarking process.
For more information, please check out the publication at IEEE Xplore:
- Categories:
![](https://sigport.org/sites/default/files/styles/list/public/POSTER_0.png?itok=ib1UwEoA)
- Read more about DELVING DEEPER INTO VULNERABLE SAMPLES IN ADVERSARIAL TRAINING
- Log in to post comments
Recently, vulnerable samples have been shown to be crucial
for improving adversarial training performance. Our analysis
on existing vulnerable samples mining methods indicate that
existing methods have two problems: 1) valuable connections
among different pairs of natural samples and their adversarial
counterparts are ignored; 2) parts of vulnerable samples are
unconsidered. To better leverage vulnerable samples, we propose INter PAir ConstrainT (INPACT) and Vulnerable Aware
- Categories:
![](https://sigport.org/sites/default/files/styles/list/public/icassp2024.png?itok=-OWxgm4Y)
- Read more about CROSS BRANCH FEATURE FUSION DECODER FOR CONSISTENCY REGULARIZATION-BASED SEMI-SUPERVISED CHANGE DETECTION
- Log in to post comments
Semi-supervised change detection (SSCD) utilizes partially labeled data and a large amount of unlabeled data to detect changes. However, the transformer-based SSCD network does not perform as well as the convolution-based SSCD network due to the lack of labeled data. To overcome this limitation, we introduce a new decoder called Cross Branch Feature Fusion (CBFF), which combines the strengths of both local convolutional branch and global transformer branch. The convolutional branch is easy to learn and can produce high-quality features with a small amount of labeled data.
- Categories:
![](https://sigport.org/sites/default/files/styles/home/public/icassp24_logo_0.jpg?itok=OGpw2wC4)
- Read more about 3D POSE ESTIMATION FROM MONOCULAR VIDEO WITH CAMERA-BONE ANGLE REGULARIZATION ON THE IMAGE FEATURE
- Log in to post comments
In this paper, we propose a monocular 3D pose estimation method which explicitly takes into account the angles between the camera optical axis and bones (camera-bone angles) as well as temporal information. The proposed method combines a 2D-to-3D-based method, which predicts a 3D pose from a sequence of 2D poses, and convolutional neural network (CNN) and includes novel regularization loss to enable the CNN to extract camera-bone-angle information.
- Categories:
![](https://sigport.org/sites/default/files/styles/list/public/multi_visual_0.jpeg?itok=meBAoo2f)
- Read more about Dynamic ASR pathways: An Adaptive Masking Approach Towards Efficient Pruning of a Multilingual ASR Model
- Log in to post comments
Neural network pruning offers an effective method for compressing a multilingual automatic speech recognition (ASR) model with minimal performance loss. However, it entails several rounds of pruning and re-training needed to be run for each language. In this work, we propose the use of an adaptive masking approach in two scenarios for pruning a multilingual ASR model efficiently, each resulting in sparse monolingual models or a sparse multilingual model (named as Dynamic ASR Pathways).
- Categories: