ICASSP 2020

ICASSP is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The ICASSP 2020 conference will feature world-class presentations by internationally renowned speakers and cutting-edge session topics, and will provide a fantastic opportunity to network with like-minded professionals from around the world.

Multi-level deep neural network adaptation for speaker verification using MMD and consistency regularization

Adapting speaker verification (SV) systems to a new environment is a very challenging task. Current adaptation methods in SV mainly focus on the backend, i.e., adaptation is carried out after the speaker embeddings have been created. In this paper, we present a DNN-based adaptation method using maximum mean discrepancy (MMD). Our method exploits two important aspects neglected by previous research.
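As a rough illustration of the MMD statistic named in this abstract (not the authors' implementation), the sketch below computes a biased estimate of the squared MMD between two small sets of vectors using a Gaussian kernel. The function names, the toy data, and the bandwidth `sigma` are assumptions made for the example.

```python
import math

def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def mmd_squared(source, target, sigma=1.0):
    """Biased estimate of the squared MMD between two samples."""
    def mean_kernel(xs, ys):
        total = sum(gaussian_kernel(x, y, sigma) for x in xs for y in ys)
        return total / (len(xs) * len(ys))
    return (mean_kernel(source, source)
            + mean_kernel(target, target)
            - 2.0 * mean_kernel(source, target))

# Toy "embeddings": one cluster near the origin, one shifted cluster.
source = [[0.0, 0.0], [0.1, -0.1], [-0.1, 0.1]]
shifted = [[2.0, 2.0], [2.1, 1.9], [1.9, 2.1]]

print(mmd_squared(source, source))   # identical samples give zero
print(mmd_squared(source, shifted))  # mismatched samples give a larger value
```

A small value suggests the two samples are distributed similarly under the chosen kernel, which is why a statistic of this kind can serve as a penalty on domain mismatch.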

3043-LinMakLiSuYu.pdf (344)

Categories: Speaker Recognition and Characterization (SPE-SPKR)

11 Views

PAN: Phoneme-Aware Network for Monaural Speech Enhancement

PAN.pdf

PAN: Phoneme-Aware Network for Monaural Speech Enhancement (463)

Categories: Speech Enhancement (SPE-ENHA)

28 Views

Information Maximized Variational Domain Adversarial Learning for Speaker Verification

Domain mismatch is a common problem in speaker verification. This paper proposes an information-maximized variational domain adversarial neural network (InfoVDANN) to reduce domain mismatch by incorporating an InfoVAE into domain adversarial training (DAT). DAT aims to produce speaker discriminative and domain-invariant features. The InfoVAE has two roles. First, it performs variational regularization on the learned features so that they follow a Gaussian distribution, which is essential for the standard PLDA backend.

5091-TuMakChien.pdf (396)

Categories: Speaker Recognition and Characterization (SPE-SPKR)

21 Views

DENOISING OF EVENT-BASED SENSORS WITH SPATIAL-TEMPORAL CORRELATION

ICASSP2020-Wu.pdf (543)

Categories: Bio-inspired multimedia systems and signal processing

66 Views

A LIGHTWEIGHT MULTI-LABEL SEGMENTATION NETWORK FOR MOBILE IRIS BIOMETRICS

icassp_2020_2321.pdf (401)

Categories: Biometrics

27 Views

Cross-domain Joint Dictionary Learning for ECG Reconstruction from PPG

An emerging research direction considers the inverse problem of inferring electrocardiogram (ECG) from photoplethysmogram (PPG) to bring about the synergy between the easy measurability of PPG and the rich clinical knowledge of ECG to facilitate preventive healthcare. Previous reconstruction using a universal basis has limited accuracy due to the lack of rich representative power. This paper proposes a cross-domain joint dictionary learning (XDJDL) framework to maximize the expressive power for the two cross-domain signals.

Xin_Tian_ICASSP2020.pdf

Presentation slides of ICASSP 2020 paper: "Cross-domain Joint Dictionary Learning (XDJDL) for ECG Reconstruction from PPG" (611)

Categories: Biomedical signal processing

242 Views

Mask-dependent Phase Estimation for Monaural Speaker Separation

Speaker Separation refers to isolating speech of interest in a multi-talker environment. Most methods apply real-valued Time-Frequency (T-F) masks to the mixture Short-Time Fourier Transform (STFT) to reconstruct the clean speech. Hence there is an unavoidable mismatch between the phase of the reconstruction and the original phase of the clean speech. In this paper, we propose a simple yet effective phase estimation network that predicts the phase of the clean speech based on a T-F mask predicted by a chimera++ network.
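The phase mismatch described in this abstract can be seen in a toy example. The sketch below (an illustration only, not the paper's network) applies a real-valued magnitude-ratio mask to two complex mixture bins; the bin values and the helper name `apply_real_mask` are assumptions made for the example.

```python
import cmath

def apply_real_mask(mixture_bins, mask):
    """Scale each complex mixture STFT bin by a real-valued mask value."""
    return [m * x for m, x in zip(mask, mixture_bins)]

# One toy frame with two frequency bins: clean speech plus interference.
clean = [cmath.rect(1.0, 0.3), cmath.rect(0.5, -1.2)]
noise = [cmath.rect(0.8, 2.0), cmath.rect(0.4, 0.7)]
mixture = [c + n for c, n in zip(clean, noise)]

# Magnitude-ratio mask: recovers the clean magnitude exactly in this toy case.
mask = [abs(c) / abs(x) for c, x in zip(clean, mixture)]
estimate = apply_real_mask(mixture, mask)

for c, x, e in zip(clean, mixture, estimate):
    print("magnitude est/clean:", abs(e), abs(c))
    print("phase est/mixture/clean:",
          cmath.phase(e), cmath.phase(x), cmath.phase(c))
```

Because the mask is real and non-negative, each estimated bin keeps the phase of the mixture even when its magnitude matches the clean speech, which is the gap a separate phase estimator is meant to close.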

Slides_Mask-dependent Phase Estimation for Monaural Speaker Separation.pdf (461)

Categories: Source Separation and Signal Enhancement

15 Views

ICASSP2020 TEXT-INDEPENDENT SPEAKER VERIFICATION WITH ADVERSARIAL LEARNING ON SHORT UTTERANCES

A text-independent speaker verification system suffers severe performance degradation under short-utterance conditions. To address this problem, we propose an adversarially learned embedding mapping model that directly maps a short embedding to an enhanced embedding with increased discriminability. In particular, a Wasserstein GAN with several loss criteria is investigated. These loss functions have distinct optimization objectives, and some of them are less favoured in speaker verification research.

TEXT-INDEPENDENT SPEAKER VERIFICATION WITH ADVERSARIAL LEARNING ON SHORT UTTERANCES.pdf

Adversarial Learning, Speaker Recognition, Short utterances (363)

Categories: Speaker Recognition and Characterization (SPE-SPKR)

81 Views

A Memory Augmented Architecture For Continuous Speaker Identification In Meetings

We introduce and analyze a novel approach to the problem of speaker identification in multi-party recorded meetings. Given a speech segment and a set of available candidate profiles, we propose a data-driven approach that learns the distance relations between them, with the aim of identifying the correct speaker label for that segment. A recurrent, memory-based architecture is employed, since this class of neural networks has been shown to yield improved performance in problems requiring relational reasoning.

2020_ICASSP_RMC_MSR_pres.pdf (303)

Categories: Speaker Recognition and Characterization (SPE-SPKR)

38 Views

ON THROUGHPUT OF MILLIMETER WAVE MIMO SYSTEMS WITH LOW RESOLUTION ADCS

The use of low-resolution analog-to-digital converters (ADCs) is an effective way to reduce the high power consumption of millimeter wave (mmWave) receivers. In this paper, a receiver with low-resolution ADCs based on adaptive thresholds is considered in downlink mmWave communications in which the channel state information is not known a priori and is acquired through channel estimation. A performance comparison of low-complexity algorithms for power and ADC allocation among transmit and receive terminals, respectively, is provided.
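To give a feel for what ADC resolution means here, the sketch below implements a plain uniform mid-rise quantizer with a fixed input range. This is a simplification chosen for illustration: it does not model the adaptive thresholds considered in the paper, and the function name `quantize` and the `full_scale` parameter are assumptions.

```python
def quantize(sample, bits, full_scale=1.0):
    """Uniform mid-rise quantizer with 2**bits levels over [-full_scale, full_scale)."""
    levels = 2 ** bits
    step = 2.0 * full_scale / levels
    # Clip to the converter's input range before picking a level.
    clipped = max(-full_scale, min(full_scale - 1e-12, sample))
    index = min(levels - 1, int((clipped + full_scale) // step))
    # Return the midpoint of the selected quantization cell.
    return -full_scale + (index + 0.5) * step

samples = [-0.9, -0.3, 0.05, 0.3, 0.8]
for bits in (1, 2, 3):
    quantized = [quantize(s, bits) for s in samples]
    worst = max(abs(s - q) for s, q in zip(samples, quantized))
    print(bits, "bit(s):", quantized, "worst error:", worst)
```

With one bit the converter keeps only the sign of the sample; each extra bit halves the step size, and the circuitry needed to resolve those finer steps is what drives up power consumption in the receiver.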

Presentation.pdf (468)

Categories: Communications and Networking

26 Views
