ICASSP 2022, the IEEE International Conference on Acoustics, Speech and Signal Processing, is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The conference will feature world-class presentations by internationally renowned speakers and cutting-edge session topics, and will provide a fantastic opportunity to network with like-minded professionals from around the world.
- HYBRID ATTENTION-BASED PROTOTYPICAL NETWORKS FOR FEW-SHOT SOUND CLASSIFICATION
In recent years, prototypical networks have been widely used in many few-shot learning scenarios. However, as a metric-based learning method, their performance often degrades in the presence of bad or noisy embedded features, and outliers in support instances. In this paper, we introduce a hybrid attention module and combine it with prototypical networks for few-shot sound classification. This hybrid attention module consists of two blocks: a feature-level attention block, and
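For readers unfamiliar with the baseline, here is a minimal sketch of plain prototypical-network classification, assuming embeddings have already been computed. It does not implement the hybrid attention module from the abstract. The function names, class labels, and toy 2-D embeddings are hypothetical and chosen only for illustration.

```python
import math


def class_prototypes(support):
    """Average the support embeddings of each class into one prototype.

    support: dict mapping a class label to a list of embedding vectors.
    """
    prototypes = {}
    for label, vectors in support.items():
        dim = len(vectors[0])
        prototypes[label] = [
            sum(v[i] for v in vectors) / len(vectors) for i in range(dim)
        ]
    return prototypes


def squared_euclidean(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify(query, prototypes):
    """Return the predicted label and class probabilities for one query."""
    # Logits are negative squared distances to each prototype.
    logits = {
        label: -squared_euclidean(query, proto)
        for label, proto in prototypes.items()
    }
    # Numerically stable softmax over the logits.
    peak = max(logits.values())
    exps = {label: math.exp(z - peak) for label, z in logits.items()}
    total = sum(exps.values())
    probs = {label: e / total for label, e in exps.items()}
    return max(probs, key=probs.get), probs


# Toy 2-way 3-shot episode (hypothetical, hand-made embeddings).
support = {
    "dog_bark": [[1.0, 0.9], [0.8, 1.1], [1.2, 1.0]],
    "siren": [[-1.0, -0.8], [-0.9, -1.2], [-1.1, -1.0]],
}
prototypes = class_prototypes(support)
label, probs = classify([0.7, 0.6], prototypes)
```

Because each prototype is an unweighted mean, a single outlying support embedding shifts it directly, which is the sensitivity to outliers that the abstract mentions.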
- Adversarially-Trained Nonnegative Matrix Factorization
- SPE-44.3: A MODEL FOR ASSESSOR BIAS IN AUTOMATIC PRONUNCIATION ASSESSMENT (POSTER)
- SPE-44.3: A MODEL FOR ASSESSOR BIAS IN AUTOMATIC PRONUNCIATION ASSESSMENT (SLIDES)
- Towards Low-Distortion Multi-Channel Speech Enhancement: The ESPNET-Se Submission to the L3DAS22 Challenge
This paper describes our submission to the L3DAS22 Challenge Task 1, which consists of speech enhancement with 3D Ambisonic microphones. The core of our approach combines Deep Neural Network (DNN) driven complex spectral mapping with linear beamformers such as the multi-frame multi-channel Wiener filter. Our proposed system has two DNNs and a linear beamformer in between. Both DNNs are trained to perform complex spectral mapping, using a combination of waveform and magnitude spectrum losses.
- Conditional Diffusion Probabilistic Model for Speech Enhancement
Speech enhancement is a critical component of many user-oriented audio applications, yet current systems still suffer from distorted and unnatural outputs. While generative models have shown strong potential in speech synthesis, they are still lagging behind in speech enhancement. This work leverages recent advances in diffusion probabilistic models, and proposes a novel speech enhancement algorithm that incorporates characteristics of the observed noisy speech signal into the diffusion and reverse processes.
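As background, the following is a minimal sketch of the standard, unconditional forward diffusion step used by diffusion probabilistic models in general. It is not the conditional process proposed in this work, which additionally incorporates the observed noisy speech. The schedule values and function names are assumptions made for illustration.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.05):
    """Linearly spaced noise variances beta_1..beta_T (assumed values)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """Running product of (1 - beta_t), usually written alpha-bar_t."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def diffuse(x0, alpha_bar, rng):
    """Draw x_t = sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * noise."""
    scale = math.sqrt(alpha_bar)
    sigma = math.sqrt(1.0 - alpha_bar)
    return [scale * x + sigma * rng.gauss(0.0, 1.0) for x in x0]


# Toy "clean signal" of four samples (hypothetical data).
betas = linear_beta_schedule(50)
alpha_bars = cumulative_alphas(betas)
rng = random.Random(0)
clean = [0.5, -0.25, 0.0, 0.75]
slightly_noisy = diffuse(clean, alpha_bars[0], rng)
mostly_noise = diffuse(clean, alpha_bars[-1], rng)
```

As the step index grows, alpha-bar shrinks, so the clean signal's contribution fades and Gaussian noise dominates; the reverse process is trained to undo these steps.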
- VarianceFlow: High-quality and Controllable Text-to-Speech Using Variance Information via Normalizing Flow
- BALANCED STRIPE-WISE PRUNING IN THE FILTER
- Longshen Ou ICASSP 2022 Exploring Transformer's Potential on Automatic Piano Transcription