- Read more about ICASSP_2022 Poster Ronchini
This paper proposes a benchmark of submissions to the Detection and Classification of Acoustic Scenes and Events (DCASE) 2021 Challenge Task 4, representing a sample of the state of the art in sound event detection. The submissions are evaluated according to the two polyphonic sound detection score scenarios proposed for DCASE 2021 Challenge Task 4, which make it possible to analyze whether a submission is designed for fine-grained temporal segmentation, for coarse-grained temporal segmentation, or to perform well across both scenarios.
- Read more about UNSUPERVISED ANOMALY DETECTION FOR CONTAINER CLOUD VIA BILSTM-BASED VARIATIONAL AUTO-ENCODER
Catastrophic forgetting is a thorny challenge when updating keyword spotting (KWS) models after deployment. To tackle this challenge, we propose a progressive continual learning strategy for small-footprint spoken keyword spotting (PCL-KWS). Specifically, the proposed PCL-KWS framework introduces a network instantiator that generates task-specific sub-networks for remembering previously learned keywords. As a result, the PCL-KWS approach incrementally learns new keywords without forgetting prior knowledge.
- Read more about Causal Alignment Based Fault Root Causes Localization for Wireless Network
- Read more about UNIFIED SPECULATION, DETECTION, AND VERIFICATION KEYWORD SPOTTING
- Read more about FUSION OF MODULATION SPECTRAL AND SPECTRAL FEATURES WITH SYMPTOM METADATA FOR IMPROVED SPEECH-BASED COVID-19 DETECTION
Existing speech-based coronavirus disease 2019 (COVID-19) detection systems provide poor interpretability and limited robustness to unseen data conditions. In this paper, we propose a system to overcome these limitations. In particular, we propose to fuse two different feature modalities with patient metadata in order to capture different properties of the disease. The first feature set is based on modulation spectral properties of speech. The second comprises spectral shape/descriptor features recently used for COVID-19 detection.
- Read more about DATA INCUBATION — SYNTHESIZING MISSING DATA FOR HANDWRITING RECOGNITION
In this paper, we demonstrate how a generative model can be used to build a better recognizer through the control of content and style. We are building an online handwriting recognizer from a modest amount of training samples. By training our controllable handwriting synthesizer on the same data, we can synthesize handwriting with previously underrepresented content (e.g., URLs and email addresses) and style (e.g., cursive and slanted). Moreover, we propose a framework to analyze a recognizer that is trained with a mixture of real and synthetic training data.
- Read more about Counting with Prediction: Rank and Select Queries with Adjusted Anchoring
Rank and select queries are fundamental building blocks of compressed data structures. On a given bit string of length n, the rank query counts the number of set bits up to a given position, and the select query finds the position of the kth set bit. We present a new data structure, and procedures on it, to support rank/select operations.
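To make the two queries concrete, here is a minimal baseline sketch in Python. It is not the data structure presented in the paper; it simply stores a full prefix-count array, which is far from space-efficient. The class name `NaiveRankSelect` and the indexing conventions (0-based positions, `rank(i)` counting bits strictly before position `i`, 1-based `k` for `select`) are assumptions made for illustration.

```python
from bisect import bisect_left
from itertools import accumulate


class NaiveRankSelect:
    """Illustrative baseline only: O(n) extra words, not a succinct structure."""

    def __init__(self, bits):
        self.bits = list(bits)
        # prefix[i] = number of set bits among bits[0:i]
        self.prefix = [0] + list(accumulate(self.bits))

    def rank(self, i):
        """Number of set bits in the first i positions (bits[0:i])."""
        return self.prefix[i]

    def select(self, k):
        """0-based position of the k-th set bit (k counts from 1)."""
        if k < 1 or k > self.prefix[-1]:
            raise ValueError("k is out of range for this bit string")
        # Smallest i with prefix[i] >= k; the k-th set bit sits at index i - 1.
        return bisect_left(self.prefix, k) - 1


rs = NaiveRankSelect([1, 0, 1, 1, 0, 0, 1, 0])
print(rs.rank(4))    # set bits among the first four positions
print(rs.select(3))  # position of the third set bit
```

Here `rank` is a single array lookup and `select` is a binary search over the prefix counts. Practical structures replace the full prefix array with sparse samples plus small lookups to reduce space.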
Gagie and Nekrich (2009) gave an algorithm for adaptive prefix-free coding that, given a string $S[1..n]$ over an alphabet of size $\sigma = o(n / \log^{5/2} n)$, encodes $S$ in at most $n(H + 1) + o(n)$ bits, where $H$ is the empirical entropy of $S$, such that encoding and decoding $S$ take $O(n)$ time. They also proved their bound on the encoding length is optimal, even when the empirical entropy is high. Their algorithm is impractical, however, because it uses complicated data structures.
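As a rough illustration of the quantities in this bound, the sketch below computes the empirical entropy of a string and the leading term $n(H + 1)$. It assumes $H$ denotes the zeroth-order empirical entropy (character frequencies only); the function names are hypothetical, and the code does not implement the coding algorithm itself or account for the $o(n)$ term.

```python
from collections import Counter
from math import log2


def empirical_entropy(s):
    """Zeroth-order empirical entropy of s in bits per character (assumed meaning of H)."""
    n = len(s)
    counts = Counter(s)
    return sum((c / n) * log2(n / c) for c in counts.values())


def leading_term_bits(s):
    """The n(H + 1) leading term of the bound, ignoring the o(n) part."""
    return len(s) * (empirical_entropy(s) + 1)


text = "abracadabra"
print(empirical_entropy(text))
print(leading_term_bits(text))
```

For a string with two equally frequent characters, such as "abab", the entropy is 1 bit per character, so the leading term is 2n bits.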