ICASSP 2021

ICASSP 2021, the IEEE International Conference on Acoustics, Speech and Signal Processing, is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The conference will feature world-class presentations by internationally renowned speakers and cutting-edge session topics, and will provide an opportunity to network with like-minded professionals from around the world.

NON-PARALLEL MANY-TO-MANY VOICE CONVERSION BY KNOWLEDGE TRANSFER FROM A TEXT-TO-SPEECH MODEL

5305.pdf (338)

Categories: Speech Synthesis and Generation, including TTS (SPE-SYNT)

Context-Aware Prosody Correction for Text-Based Speech Editing

Text-based speech editors expedite the process of editing speech recordings by permitting editing via intuitive cut, copy, and paste operations on a speech transcript. A major drawback of current systems, however, is that edited recordings often sound unnatural because of prosody mismatches around edited regions. In our work, we propose a new context-aware method for more natural-sounding text-based editing of speech.

icassp-2021-poster.pdf (333)

icassp_2021_context-aware.pdf (294)

Categories: Speech Synthesis and Generation, including TTS (SPE-SYNT)

SAFE SCREENING FOR SPARSE REGRESSION WITH THE KULLBACK-LEIBLER DIVERGENCE

Safe screening rules are powerful tools to accelerate iterative solvers in sparse regression problems. They allow early identification of inactive coordinates (i.e., those not belonging to the support of the solution) which can thus be screened out in the course of iterations. In this paper, we extend the GAP Safe screening rule to the L1-regularized Kullback-Leibler divergence which does not fulfill the regularity assumptions made in previous works. The proposed approach is experimentally validated on synthetic and real count data sets.
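
As a rough illustration of the kind of problem being solved, the sketch below evaluates an L1-regularized Kullback-Leibler objective for nonnegative count data. The abstract does not state the exact formulation, so the generalized KL form, the small offset `eps`, and the names `kl_l1_objective`, `A`, `x`, and `lam` are all assumptions made for this example. The screening rule itself is not reproduced here.

```python
import numpy as np

def kl_l1_objective(y, A, x, lam, eps=1e-10):
    """Generalized KL divergence between y and A @ x, plus an L1 penalty on x.

    Assumed form (not taken from the paper):
        sum_i [ y_i * log(y_i / z_i) - y_i + z_i ] + lam * ||x||_1,
    with z = A @ x + eps, where eps keeps the logarithm finite.
    """
    y = np.asarray(y, dtype=float)
    A = np.asarray(A, dtype=float)
    x = np.asarray(x, dtype=float)
    z = A @ x + eps
    mask = y > 0  # convention: 0 * log(0) = 0
    kl = np.sum(y[mask] * np.log(y[mask] / z[mask])) - np.sum(y) + np.sum(z)
    return kl + lam * np.sum(np.abs(x))
```

Coordinates of `x` that a screening rule certifies as zero can be dropped from `A` before evaluating such an objective, which is where the speed-up in an iterative solver would come from.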

2021-06_KL-Screening_ICASSP_CFDantas.pdf

Presentation slides (419)

ICASSP2021_poster.pdf

Poster (436)

Categories: Signal and System Modeling, Representation and Estimation; Sampling and Reconstruction

MILLIMETER WAVE MIMO CHANNEL ESTIMATION WITH 1-BIT SPATIAL SIGMA-DELTA ANALOG-TO-DIGITAL CONVERTERS

This paper focuses on channel estimation for mmWave MIMO systems with 1-bit spatial sigma-delta analog-to-digital converters (ADCs) and digital-to-analog converters (DACs). The channel estimation performance with 1-bit spatial sigma-delta modulators (i.e., ADCs or DACs) depends on the quantization noise modeling. Therefore, we present a new method for modeling the quantization noise by leveraging the deterministic input-output relation of the 1-bit spatial sigma-delta modulator.
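
To make the phrase "deterministic input-output relation" concrete, here is a minimal sketch of a generic first-order 1-bit sigma-delta modulator. It is a simplified, real-valued version written for illustration only: the index `n` stands in for the antenna index of a spatial modulator, and the function name `sigma_delta_1bit` and the error-feedback form are assumptions, not the paper's model (which would involve complex signals).

```python
import numpy as np

def sigma_delta_1bit(x):
    """First-order 1-bit sigma-delta with error feedback (illustrative only).

    Each element sees its input minus the previous element's quantization
    error, so the outputs satisfy y[n] = x[n] + q[n] - q[n-1] exactly.
    """
    x = np.asarray(x, dtype=float)
    y = np.empty_like(x)
    q = np.empty_like(x)
    prev_q = 0.0
    for n, xn in enumerate(x):
        r = xn - prev_q                    # quantizer input
        y[n] = 1.0 if r >= 0 else -1.0     # 1-bit quantizer
        q[n] = y[n] - r                    # quantization error
        prev_q = q[n]
    return y, q
```

Because the error enters as a first difference, `q[n] - q[n-1]`, it is pushed toward high (spatial) frequencies, and for inputs with magnitude at most 1 the error stays bounded by 1 in this sketch.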

Prasobh_ICASSP21_slides.pdf

Presentation slides (13523)

Prasobh_ICASSP21_poster.pdf

Poster (485)

Categories: MIMO Communications and Signal Processing

Parallel waveform synthesis based on generative adversarial networks with voicing-aware conditional discriminators

This paper proposes voicing-aware conditional discriminators for Parallel WaveGAN-based waveform synthesis systems. In this framework, we adopt a projection-based conditioning method that can significantly improve the discriminator's performance. Furthermore, the conventional discriminator is separated into two waveform discriminators for modeling voiced and unvoiced speech.

20210421_icassp2021_pwgvuvd_v4.pdf (494)

Categories: Speech Synthesis and Generation, including TTS (SPE-SYNT)

Identification of Uterine Contractions by An Ensemble of Gaussian Processes

Identifying uterine contractions with the aid of machine learning methods is necessary given their use, in combination with fetal heart rates and other clinical data, for the assessment of a fetus's wellbeing. In this paper, we study contraction identification by processing noisy signals due to uterine activities. We propose a complete four-step method where we address the imbalanced classification problem with an ensemble Gaussian process classifier, where the Gaussian process latent variable model is used as a decision-maker.

UC identification.pdf (365)

Categories: Biomedical signal processing

Class-imbalanced classifiers using ensembles of Gaussian processes and Gaussian process latent variable models

Classification with imbalanced data is a common and challenging problem in many practical machine learning applications. Ensemble learning is a popular solution where the results from multiple base classifiers are synthesized to reduce the effect of a possibly skewed distribution of the training set. In this paper, binary classifiers based on Gaussian processes are chosen as bases for inferring the predictive distributions of test latent variables. We apply a Gaussian process latent variable model where the outputs of the Gaussian processes are used for making the final decision.
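
The general ensemble idea described above can be sketched as follows. This is not the paper's method: a nearest-centroid rule stands in for the Gaussian process base classifiers, and a simple majority vote stands in for the Gaussian process latent variable model that makes the final decision. All function names are hypothetical, and label 1 is assumed to be the minority class.

```python
import numpy as np

def balanced_subsets(X, y, n_models, rng):
    """Yield training subsets with all minority samples and an equal-sized
    random draw (without replacement) from the majority class."""
    minority = np.flatnonzero(y == 1)
    majority = np.flatnonzero(y == 0)
    for _ in range(n_models):
        drawn = rng.choice(majority, size=len(minority), replace=False)
        idx = np.concatenate([minority, drawn])
        yield X[idx], y[idx]

def fit_centroids(X, y):
    """Stand-in base classifier: one centroid per class."""
    return X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)

def predict_votes(centroids, X):
    """Vote 1 where a point is closer to the class-1 centroid, else 0."""
    c0, c1 = centroids
    d0 = np.linalg.norm(X - c0, axis=1)
    d1 = np.linalg.norm(X - c1, axis=1)
    return (d1 < d0).astype(float)

def ensemble_predict(X_train, y_train, X_test, n_models=5, seed=0):
    """Combine the base classifiers by majority vote."""
    rng = np.random.default_rng(seed)
    votes = [predict_votes(fit_centroids(Xs, ys), X_test)
             for Xs, ys in balanced_subsets(X_train, y_train, n_models, rng)]
    return (np.mean(votes, axis=0) >= 0.5).astype(int)
```

Each base classifier is trained on a balanced subset, so no single model is dominated by the majority class, while the ensemble as a whole still sees many of the majority samples.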

EnGPC_GPLVM.pdf (277)

Categories: Pattern recognition and classification (MLR-PATT)

Antenna Selection For Massive MIMO Systems Based On POMDP Framework

poster.pdf (352)

Categories: MIMO Communications and Signal Processing

FUSION-BASED DIGITAL IMAGE CORRELATION FRAMEWORK FOR STRAIN MEASUREMENT

We address the problem of enabling two-dimensional digital image correlation (DIC) for strain measurement on large three-dimensional objects with curved surfaces. It is challenging to acquire full-field qualified images of the surface required by DIC due to geometric distortion and the narrow visual field of the surface that a single image can cover. To overcome this issue, we propose an end-to-end DIC framework incorporating the image fusion principle to achieve full-field strain measurement over the curved surface.

Crackdetection_icassp_2021.pdf

Here are the slides of our work (584)

Categories: Sampling and Reconstruction

Improved Data Selection for Domain Adaptation in ASR

Automatic speech recognition (ASR) systems are highly sensitive to train-test domain mismatch. However, because transcription is often prohibitively expensive, it is important to be able to make use of available transcribed out-of-domain data. We address the problem of domain adaptation with semi-supervised training (SST). Contrary to work in in-domain SST, we find significant performance improvement even with just one hour of target-domain data, though the selection of the data is critical.

wotherspoon_poster_final.pdf

Poster_Improved_Data_Selection_for_Domain_Adaptation_in_ASR (341)

Categories: General Topics in Speech Recognition (SPE-GASR)
