ICASSP 2022

ICASSP 2022, the IEEE International Conference on Acoustics, Speech and Signal Processing, is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The conference will feature world-class presentations by internationally renowned speakers and cutting-edge session topics, and will provide an opportunity to network with like-minded professionals from around the world.

Partially Fake Audio Detection by Self-attention-based Fake Span Discovery

The past few years have witnessed significant advances in speech synthesis and voice conversion technologies. However, such technologies can undermine the robustness of widely deployed biometric identification models and can be harnessed by in-the-wild attackers for illegal uses. The ASVspoof challenge mainly focuses on audio synthesized by advanced speech synthesis and voice conversion models, and on replay attacks. Recently, the first Audio Deep Synthesis Detection challenge (ADD 2022) extended the attack scenarios to cover more aspects.

ADDposter-20220418.pdf

ADDposter-20220418.pdf (252)

Categories: Speech Processing

10 Views

MODEL-BASED APPROACH FOR MEASURING THE FAIRNESS IN ASR

The issue of fairness arises when automatic speech recognition (ASR) systems do not perform equally well for all subgroups of the population. In any fairness measurement study for ASR, three open questions are especially important: how to control for confounding factors, how to handle unobserved heterogeneity across speakers, and how to trace the source of any word error rate (WER) gap among different subgroups. If these are not appropriately accounted for, incorrect conclusions will be drawn.
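
For context, the raw WER gap between two subgroups can be computed as in the minimal sketch below. This is not the model-based approach the paper proposes: it shows only the uncontrolled measurement that the abstract warns can mislead when confounders are ignored. The function names and the toy transcripts are hypothetical, invented for illustration.

```python
def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def group_wer(pairs):
    """WER over (reference, hypothesis) pairs: total errors / total reference words."""
    errors = sum(edit_distance(r.split(), h.split()) for r, h in pairs)
    words = sum(len(r.split()) for r, _ in pairs)
    return errors / words


# Hypothetical toy data, not taken from the paper.
group_a = [("turn on the light", "turn on the light"),
           ("play some music", "play some music")]
group_b = [("turn on the light", "turn the light"),
           ("play some music", "play sam music")]

wer_a = group_wer(group_a)
wer_b = group_wer(group_b)
gap = wer_b - wer_a  # raw, uncontrolled WER gap between the two subgroups
```

A gap computed this way mixes the effect of subgroup membership with everything else that differs between the two sets of utterances, which is the reason the abstract calls for controlling confounding factors and speaker-level heterogeneity.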

1229_Poster.pdf

1229_Poster.pdf (247)

Categories: General Topics in Speech Recognition (SPE-GASR)

18 Views

Dual-branch Attention-In-Attention Transformer for single-channel speech enhancement

Poster_ICASSP2022_dbaiat.pdf

Poster_ICASSP2022_dbaiat (361)

ICASSP2022_DBAIAT_slides.pdf

ICASSP2022_DBAIAT_slides.pdf (487)

Categories: Speech Enhancement (SPE-ENHA)

27 Views

UNSUPERVISED ANOMALY DETECTION FOR CONTAINER CLOUD VIA BILSTM-BASED VARIATIONAL AUTO-ENCODER

icassp2022-poster-#2451.pdf

icassp2022-poster-#2451.pdf (349)

Categories: Other

28 Views

TIME-DOMAIN AUDIO-VISUAL SPEECH SEPARATION ON LOW QUALITY VIDEOS

Incorporating visual information is a promising approach to improving the performance of speech separation. Many related studies have been conducted and have reported encouraging results. However, low-quality videos are common in real scenarios and may significantly degrade the performance of conventional audio-visual speech separation systems. In this paper, we propose a new structure to fuse the audio and visual features, which uses the audio feature to select relevant visual features through an attention mechanism.
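
The idea of letting an audio feature select relevant visual features can be illustrated with plain scaled dot-product attention, as in the sketch below. This is an assumption-laden illustration, not the paper's architecture: the audio vector acts as the query, the visual vectors serve as both keys and values, and there are no learned projections. All names and numbers are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(audio_query, visual_feats):
    """One audio vector attends over a list of visual vectors of the same size."""
    d = len(audio_query)
    # Similarity between the audio query and each visual feature, scaled by sqrt(d).
    scores = [sum(q * k for q, k in zip(audio_query, v)) / math.sqrt(d)
              for v in visual_feats]
    weights = softmax(scores)
    # Weighted sum of visual features: poorly matching frames contribute less.
    fused = [sum(w * v[i] for w, v in zip(weights, visual_feats))
             for i in range(d)]
    return weights, fused


# Hypothetical toy input: three visual frames, the first best matches the audio.
audio_query = [1.0, 0.0]
visual_feats = [[4.0, 0.0], [0.0, 4.0], [0.0, 0.0]]
weights, fused = attend(audio_query, visual_feats)
```

In this toy input the first visual frame receives the largest weight, so a degraded or irrelevant frame has less influence on the fused representation.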

poster.pdf

Poster (261)

presentation.pptx

Slides (241)

Categories: Source Separation and Signal Enhancement; Multimodal signal processing

24 Views

DPT-FSNET: DUAL-PATH TRANSFORMER BASED FULL-BAND AND SUB-BAND FUSION NETWORK FOR SPEECH ENHANCEMENT

Poster_2333.pdf

Poster_2333.pdf (255)

Categories: Speech Enhancement (SPE-ENHA)

28 Views

Progressive Continual Learning for Spoken Keyword Spotting

Catastrophic forgetting is a thorny challenge when updating keyword spotting (KWS) models after deployment. To tackle such challenges, we propose a progressive continual learning strategy for small-footprint spoken keyword spotting (PCL-KWS). Specifically, the proposed PCL-KWS framework introduces a network instantiator to generate the task-specific sub-networks for remembering previously learned keywords. As a result, the PCL-KWS approach incrementally learns new keywords without forgetting prior knowledge.

Progressive Continual Learning for Spoken Keyword Spotting.pdf

Progressive Continual Learning for Spoken Keyword Spotting.pdf (312)

Categories: Other

23 Views

Recovery of Graph Signals from Sign Measurements

POSTER-ICASSP 2022.pdf

poster of paper 2748 for ICASSP 2022 (300)

RECOVERY OF GRAPH SIGNALS FROM SIGN MEASUREMENTS.pdf

presentation slides of paper 2748 for ICASSP 2022 (283)

Categories: Sampling and Reconstruction

22 Views

Presentation Slides of TOWARDS CONTROLLABLE AND PHYSICAL INTERPRETABLE UNDERWATER SCENE SIMULATION

slides.pptx

slides.pptx (187)

Categories: Image/Video Processing

6 Views

EXPLORING TRANSFERABILITY MEASURES AND DOMAIN SELECTION IN CROSS-DOMAIN SLOT FILLING

As an essential task for natural language understanding, slot filling aims to identify the contiguous spans of specific slots in an utterance. In real-world applications, labeling utterances can be expensive, and transfer learning techniques have been developed to ease this problem. However, cross-domain slot filling can suffer significantly from negative transfer due to non-targeted or zero-shot slots.

Slot-ICASSP2022-Poster.pdf

Poster (381)

Categories: Other applications of machine learning (MLR-APPL)

27 Views
