
ICASSP 2022 - IEEE International Conference on Acoustics, Speech and Signal Processing is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The ICASSP 2022 conference will feature world-class presentations by internationally renowned speakers and cutting-edge session topics, and will provide an opportunity to network with like-minded professionals from around the world.

A spatial active noise control (ANC) method based on the individual kernel interpolation of primary and secondary sound fields is proposed. Spatial ANC is aimed at cancelling unwanted primary noise within a continuous region by using multiple secondary sources and microphones. A method based on the kernel interpolation of a sound field makes it possible to attenuate noise over the target region with flexible array geometry. Furthermore, by using the kernel function with directional weighting, prior information on primary noise source directions can be taken into consideration.
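
To make the idea of kernel interpolation of a sound field concrete, the following is a minimal sketch rather than the method of the paper. It assumes a single frequency, a 3D field, and the uniform (non-directional) kernel sin(kr)/(kr), which is a common choice for fields satisfying the Helmholtz equation; the directional weighting mentioned in the abstract is not modelled. The function names, the regularization parameter `reg`, and the array shapes are all hypothetical.

```python
import numpy as np


def sinc_kernel(x, y, wavenumber):
    """Uniform kernel sin(k r) / (k r) between two sets of 3D points.

    x has shape (n, 3), y has shape (m, 3); the result has shape (n, m).
    Assumption: no directional weighting is applied.
    """
    dist = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=-1)
    # np.sinc(t) = sin(pi t) / (pi t), so dividing by pi gives sin(kr)/(kr).
    return np.sinc(wavenumber * dist / np.pi)


def interpolate_pressure(mic_pos, mic_pressure, eval_pos, wavenumber, reg=1e-3):
    """Estimate pressure at eval_pos from microphone measurements.

    This is kernel ridge regression: solve (K + reg * I) w = p at the
    microphones, then evaluate the kernel expansion at the new points.
    """
    gram = sinc_kernel(mic_pos, mic_pos, wavenumber)
    weights = np.linalg.solve(gram + reg * np.eye(len(mic_pos)), mic_pressure)
    return sinc_kernel(eval_pos, mic_pos, wavenumber) @ weights
```

With a very small `reg`, evaluating at the microphone positions themselves reproduces the measured pressures, which is a quick sanity check before evaluating inside the target region.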


Named Entity Recognition (NER) from speech is a Spoken Language Understanding (SLU) task that aims to extract semantic information from the speech signal. NER from speech is usually performed with a two-step pipeline that consists of (1) processing the audio using an Automatic Speech Recognition (ASR) system and (2) applying an NER tagger to the ASR outputs. Recent works have shown the capability of the End-to-End (E2E) approach for NER from English and French speech, which is essentially entity-aware ASR.


Graph convolutional network (GCN) is a novel framework that utilizes a pre-defined Laplacian matrix to learn graph data effectively. With its powerful nonlinear fitting ability, GCN can produce high-quality node embeddings. However, a generalized GCN can only handle static graphs, whereas a large number of graphs are dynamic and evolve over time, which limits the application field of GCN. To address this challenge, GCNs are naturally combined with recurrent neural networks (RNNs) to capture dynamic graph changes through joint training.
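
As an illustration of how a GCN uses a pre-defined graph matrix, here is a minimal sketch of one static graph convolution layer. It is not the model of the paper: it assumes the widely used renormalized adjacency D^{-1/2}(A + I)D^{-1/2} as the fixed propagation matrix and a ReLU nonlinearity, and it omits the recurrent part entirely. The function names and shapes are hypothetical.

```python
import numpy as np


def normalized_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a dense adjacency matrix.

    Self-loops are added first, so every node has degree >= 1 and the
    inverse square root is always defined.
    """
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(adj, features, weight):
    """One layer: ReLU(A_norm @ H @ W).

    adj: (n, n) adjacency, features: (n, f_in), weight: (f_in, f_out).
    """
    return np.maximum(normalized_adjacency(adj) @ features @ weight, 0.0)
```

Because the propagation matrix is computed once from a fixed adjacency, this layer has no notion of time, which is the limitation the abstract refers to when it mentions dynamic graphs.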


The Speech processing Universal PERformance Benchmark (SUPERB) is a leaderboard that benchmarks the performance of a shared self-supervised learning (SSL) speech model across various downstream speech tasks with minimal modification of architectures and a small amount of data. It has fueled research on speech representation learning. SUPERB demonstrates that speech SSL upstream models improve the performance of various downstream tasks with only minimal adaptation.


Automatic speaker verification (ASV), one of the most important technologies for biometric identification, has been widely adopted in security-critical applications. However, ASV is seriously vulnerable to recently emerged adversarial attacks, and effective countermeasures against them are limited. In this paper, we adopt neural vocoders to spot adversarial samples for ASV.


Fine-grained urban flow inference (FUFI) aims at enhancing the resolution of traffic flow, which plays an important role in intelligent traffic management. Existing FUFI methods are mainly based on techniques from image super-resolution (SR) models, which cannot fully capture the influence of external factors and face the ill-posed problem in SR tasks. In this paper, we propose UFI-Flow – Urban Flow Inference via normalizing Flow, a novel model for addressing the FUFI problem in a principled manner by using a single probabilistic loss.

