ICASSP 2022 - IEEE International Conference on Acoustics, Speech and Signal Processing is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The ICASSP 2022 conference will feature world-class presentations by internationally renowned speakers and cutting-edge session topics, and will provide an opportunity to network with like-minded professionals from around the world.
- Read more about Spatio-Temporal PRRS Epidemic Forecasting via Factorized Deep Generative Modeling
- Read more about Provable Sample Complexity Guarantees for Learning of Continuous-Action Graphical Games with Nonparametric Utilities
- Read more about Information Theoretic Limits for Standard and One-bit Compressed Sensing with Graph-structured Sparsity
- Read more about A Method to Reveal Speaker Identity in Distributed ASR Training, and How to Counter It
End-to-end Automatic Speech Recognition (ASR) models are commonly trained over spoken utterances using optimization methods like Stochastic Gradient Descent (SGD). In distributed settings like Federated Learning, model training requires transmission of gradients over a network. In this work, we design the first method for revealing the identity of the speaker of a training utterance with access only to a gradient.
- Read more about Exploring Heterogeneous Characteristics of Layers in ASR Models for More Efficient Training
Transformer-based architectures have been the subject of research aimed at understanding their overparameterization and the non-uniform importance of their layers. Applying these approaches to Automatic Speech Recognition, we demonstrate that the state-of-the-art Conformer models generally have multiple ambient layers. We study the stability of these layers across runs and model sizes, propose that group normalization may be used without disrupting their formation, and examine their correlation with model weight updates in each layer.
- Read more about CRAMÉR-RAO BOUND AND ANTENNA SELECTION OPTIMIZATION FOR DUAL RADAR-COMMUNICATION DESIGN
Variational nonlinear chirp mode decomposition (VNCMD) is a recently introduced method for nonlinear chirp signal decomposition that has attracted notable attention in various fields. One limitation of the method is that its performance relies heavily on the setting of the bandwidth parameter.
- Read more about SKIPPING MEMORY LSTM FOR LOW-LATENCY REAL-TIME CONTINUOUS SPEECH SEPARATION
Continuous speech separation for meeting pre-processing has recently become an active research topic. Compared to the data in utterance-level speech separation, a meeting-style audio stream lasts longer and has an uncertain number of speakers. We adopt the time-domain speech separation method and the recently proposed Graph-PIT to build a very low-latency online speech separation model, which is important for real applications. The low-latency time-domain encoder with a small stride leads to an extremely long feature sequence.
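To illustrate why a small encoder stride produces a very long feature sequence, here is a minimal sketch that counts the frames emitted by an unpadded 1-D convolutional encoder. The sampling rate, kernel sizes, and strides below are illustrative assumptions, not values taken from the paper, and `num_frames` is a hypothetical helper.

```python
def num_frames(num_samples: int, kernel_size: int, stride: int) -> int:
    """Number of frames an unpadded 1-D convolutional encoder produces."""
    if num_samples < kernel_size:
        return 0
    return (num_samples - kernel_size) // stride + 1


SAMPLE_RATE = 16_000            # assumed sampling rate in Hz
one_minute = 60 * SAMPLE_RATE   # samples in one minute of audio

# (kernel_size, stride) pairs, chosen only for illustration
for kernel_size, stride in [(400, 160), (16, 8), (2, 1)]:
    frames = num_frames(one_minute, kernel_size, stride)
    print(f"kernel={kernel_size:>3} stride={stride:>3} -> {frames} frames per minute")
```

Under these assumptions, halving the stride roughly doubles the sequence length that the separator has to process for the same audio, which is the cost a low-latency encoder has to manage.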
- Read more about STATISTICAL, SPECTRAL AND GRAPH REPRESENTATIONS FOR VIDEO-BASED FACIAL EXPRESSION RECOGNITION IN CHILDREN