ICASSP 2022

ICASSP 2022, the IEEE International Conference on Acoustics, Speech and Signal Processing, is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The conference will feature world-class presentations by internationally renowned speakers and cutting-edge session topics, and will provide an opportunity to network with like-minded professionals from around the world.

DRVC: A Framework of Any-to-Any Voice Conversion with Self-Supervised Learning

DRVC-presentation.pdf (258)

Categories: Speech Synthesis and Generation, including TTS (SPE-SYNT)

16 Views

SPATIAL ACTIVE NOISE CONTROL BASED ON INDIVIDUAL KERNEL INTERPOLATION OF PRIMARY AND SECONDARY SOUND FIELDS

A spatial active noise control (ANC) method based on the individual kernel interpolation of primary and secondary sound fields is proposed. Spatial ANC is aimed at cancelling unwanted primary noise within a continuous region by using multiple secondary sources and microphones. A method based on the kernel interpolation of a sound field makes it possible to attenuate noise over the target region with flexible array geometry. Furthermore, by using the kernel function with directional weighting, prior information on primary noise source directions can be taken into consideration.
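
As a rough illustration of the kernel interpolation idea mentioned in this abstract, the sketch below estimates the sound pressure at arbitrary points from a few microphone signals by kernel ridge regression. It is not the authors' method. The kernel (the zeroth-order spherical Bessel function often used for 3D sound fields, here without directional weighting), the regularization value `lam`, the microphone layout, the plane-wave test signal, and all function names are assumptions chosen for the example.

```python
import numpy as np

def sound_field_kernel(r1, r2, k):
    """Assumed kernel j0(k * ||r1 - r2||) = sin(x) / x for all pairs of points."""
    dist = np.linalg.norm(r1[:, None, :] - r2[None, :, :], axis=-1)
    # np.sinc(t) computes sin(pi t) / (pi t), so divide the argument by pi.
    return np.sinc(k * dist / np.pi)

def interpolate_pressure(mic_pos, mic_signal, eval_pos, k, lam=1e-6):
    """Estimate pressure at eval_pos from microphone signals (kernel ridge regression)."""
    gram = sound_field_kernel(mic_pos, mic_pos, k)
    weights = np.linalg.solve(gram + lam * np.eye(len(mic_pos)), mic_signal)
    return sound_field_kernel(eval_pos, mic_pos, k) @ weights

# Hypothetical setup: four microphones and a single 500 Hz plane wave as primary noise.
k = 2 * np.pi * 500 / 343.0  # wavenumber in rad/m, assuming a sound speed of 343 m/s
mic_pos = np.array([[0.0, 0.0, 0.0],
                    [0.5, 0.0, 0.0],
                    [0.0, 0.5, 0.0],
                    [0.0, 0.0, 0.5]])
direction = np.array([1.0, 0.0, 0.0])
mic_signal = np.exp(-1j * k * mic_pos @ direction)

# Evaluating at the microphone positions should approximately reproduce the measurements.
estimate = interpolate_pressure(mic_pos, mic_signal, mic_pos, k)
```

Passing other coordinates as `eval_pos` gives estimates between the microphones, which is the property that lets a spatial ANC formulation target a continuous region rather than only the sensor positions.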

kazuyuki-arikawa_ICASSP_slide.pdf (418)

kazuyuki-arikawa_ICASSP_poster.pdf (377)

Categories: Active Noise Control

101 Views

Language Adaptive Cross-lingual Speech Representation Learning with Sparse Sharing Sub-networks

poster_s3net_yzl_ICASSP2022.pdf

Poster (239)

ppt_s3net_yzl_ICASSP2022.pdf

Slides (263)

Categories: Multilingual Recognition and Identification (SPE-MULT)

32 Views

Generation For Adaption: A GAN-Based Approach for Unsupervised Domain Adaption with 3D Point Cloud Data

Generation For Adaption Poster.pdf (280)

Categories: Machine Learning for Signal Processing

19 Views

AISHELL-NER: Named Entity Recognition from Chinese Speech

Named Entity Recognition (NER) from speech is one of the Spoken Language Understanding (SLU) tasks and aims to extract semantic information from the speech signal. NER from speech is usually performed through a two-step pipeline that consists of (1) processing the audio using an Automatic Speech Recognition (ASR) system and (2) applying an NER tagger to the ASR outputs. Recent works have shown the capability of the End-to-End (E2E) approach, which is essentially entity-aware ASR, for NER from English and French speech.

Poster.pdf (1910)

Categories: Audio and Acoustic Signal Processing

16 Views

Improving Dynamic Graph Convolutional Network with Fine-grained Attention Mechanism

The graph convolutional network (GCN) is a novel framework that uses a pre-defined Laplacian matrix to learn from graph data effectively. With its powerful nonlinear fitting ability, GCN can produce high-quality node embeddings. However, a generalized GCN can only handle static graphs, whereas a large number of graphs are dynamic and evolve over time, which limits the application field of GCN. To address this challenge, GCN is naturally combined with a recurrent neural network (RNN) to capture dynamic graph changes through joint training.
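
For readers unfamiliar with the static building block this abstract refers to, here is a minimal sketch of a single graph convolution layer. It assumes the widely used symmetric normalization with self-loops, D^{-1/2}(A + I)D^{-1/2}, as the pre-defined graph operator. The abstract does not say which normalization the authors use, and the toy graph, feature values, and function names are made up for illustration. The recurrent part that handles dynamic graphs is not shown.

```python
import numpy as np

def normalized_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} (assumed choice of graph operator)."""
    a_hat = adj + np.eye(adj.shape[0])           # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(a_norm, features, weight):
    """One graph convolution: aggregate neighbours, project, apply ReLU."""
    return np.maximum(a_norm @ features @ weight, 0.0)

# Hypothetical 3-node path graph 0 - 1 - 2 with 2 input features per node.
adj = np.array([[0.0, 1.0, 0.0],
                [1.0, 0.0, 1.0],
                [0.0, 1.0, 0.0]])
features = np.array([[1.0, 0.0],
                     [0.0, 1.0],
                     [1.0, 1.0]])
rng = np.random.default_rng(0)
weight = rng.standard_normal((2, 4))             # untrained random weights

a_norm = normalized_adjacency(adj)
embeddings = gcn_layer(a_norm, features, weight)  # one 4-dimensional embedding per node
```

Because `a_norm` is fixed by the graph structure, the layer as written only covers a static graph, which is the limitation the abstract points out.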

icassp-poster.pdf (727)

Categories: System-on-chip architectures for signal processing

12 Views

Characterizing the adversarial vulnerability of speech self-supervised learning

The Speech processing Universal PERformance Benchmark (SUPERB) is a leaderboard that benchmarks the performance of a shared self-supervised learning (SSL) speech model across various downstream speech tasks with minimal modification of architectures and a small amount of data. It has fueled research on speech representation learning. SUPERB demonstrates that speech SSL upstream models improve the performance of various downstream tasks with only minimal adaptation.

AttackSSL-poster.pdf (259)

Categories: Speech Processing

8 Views

Adversarial sample detection for speaker verification by neural vocoders

Automatic speaker verification (ASV), one of the most important technologies for biometric identification, has been widely adopted in security-critical applications. However, ASV is seriously vulnerable to recently emerged adversarial attacks, and effective countermeasures against them are limited. In this paper, we adopt neural vocoders to spot adversarial samples for ASV.

Vocoder-report.pdf (255)

Categories: Speech Processing

21 Views

Joint magnitude estimation and phase recovery using Cycle-in-Cycle GAN for non-parallel speech enhancement

Poster_ICASSP2022_cincgan.pdf (318)

ICASSP2022_CINCGAN_slides.pdf (236)

Categories: Speech Enhancement (SPE-ENHA)

29 Views

Probabilistic Fine-grained Urban Flow Inference with Normalizing Flows

Fine-grained urban flow inference (FUFI) aims at enhancing the resolution of traffic flow, which plays an important role in intelligent traffic management. Existing FUFI methods are mainly based on techniques from image super-resolution (SR) models, which cannot fully capture the influence of external factors and face the ill-posed problem in SR tasks. In this paper, we propose UFI-Flow – Urban Flow Inference via normalizing Flow, a novel model for addressing the FUFI problem in a principled manner by using a single probabilistic loss.
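
To make the phrase "a single probabilistic loss" more concrete, the sketch below shows the change-of-variables rule that normalizing flows rely on, using the simplest possible case: one scalar affine transformation with a standard normal base distribution. This is not UFI-Flow. The transformation, the sample values, and the function names are assumptions made for illustration only.

```python
import math

def standard_normal_logpdf(z):
    """Log-density of the standard normal base distribution."""
    return -0.5 * (z * z + math.log(2.0 * math.pi))

def affine_flow_logpdf(x, shift, scale):
    """Exact log-density of x under the flow z = (x - shift) / scale."""
    z = (x - shift) / scale
    log_det = -math.log(abs(scale))  # log |dz/dx| from the change of variables
    return standard_normal_logpdf(z) + log_det

def negative_log_likelihood(samples, shift, scale):
    """Average negative log-likelihood, the single loss a flow is trained on."""
    return -sum(affine_flow_logpdf(x, shift, scale) for x in samples) / len(samples)

samples = [1.8, 2.1, 2.4, 1.9]  # made-up data clustered around 2
nll_near = negative_log_likelihood(samples, shift=2.0, scale=0.25)
nll_far = negative_log_likelihood(samples, shift=0.0, scale=0.25)
```

Parameters that place the data in a high-density region of the base distribution give a lower loss, so `nll_near` is smaller than `nll_far`. Practical flows stack many invertible layers, but each contributes a log-determinant term in the same way.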

ICASSP_slides.pptx

slides (416)

ICASSP_5513_poster.pdf

poster (428)

Categories: Knowledge and Data Engineering

35 Views
