Speaker Recognition and Characterization (SPE-SPKR)

DISENTANGLED SPEAKER EMBEDDING FOR ROBUST SPEAKER VERIFICATION

Entanglement of speaker features and redundant features may lead to poor performance when speaker verification systems are evaluated on an unseen domain. To address this issue, we propose an InfoMax domain separation and adaptation network (InfoMax–DSAN) that uses domain adaptation techniques to disentangle domain-specific features from domain-invariant speaker features. A frame-based mutual information neural estimator is proposed to maximize the mutual information between frame-level features and input acoustic features, which can help retain more useful information.
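
For orientation, mutual information neural estimators are commonly trained by maximizing the Donsker-Varadhan lower bound, I(X;Z) >= E_joint[T] - log E_marginal[exp(T)], where T is a learned critic. The sketch below only evaluates that bound from precomputed critic scores. The function name `dv_lower_bound`, the critic itself, and the way frame-level features would be paired with acoustic frames are assumptions; the abstract does not specify the paper's frame-based estimator.

```python
import math


def dv_lower_bound(joint_scores, marginal_scores):
    """Donsker-Varadhan bound: E_joint[T] - log E_marginal[exp(T)].

    joint_scores: critic outputs T(x, z) for matched (feature, input) pairs.
    marginal_scores: critic outputs for mismatched (shuffled) pairs.
    Both are hypothetical inputs; a real system would get them from a network.
    """
    joint_term = sum(joint_scores) / len(joint_scores)
    # Log-mean-exp with the maximum subtracted for numerical stability.
    peak = max(marginal_scores)
    mean_exp = sum(math.exp(s - peak) for s in marginal_scores) / len(marginal_scores)
    return joint_term - (peak + math.log(mean_exp))
```

A critic that scores matched pairs higher than shuffled pairs yields a larger bound, which is the quantity an InfoMax-style objective would push upward during training.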

slides.pdf (149)

Categories: Speaker Recognition and Characterization (SPE-SPKR)

8 Views

Robust speaker verification using Population-based Data Augmentation Poster

poster_icassp2022.pdf (229)

Categories: Speaker Recognition and Characterization (SPE-SPKR)

8 Views

Robust speaker verification using Population-based Data Augmentation

population_sv_icassp2022.pdf

ppf file (141)

Categories: Speaker Recognition and Characterization (SPE-SPKR)

4 Views

"Self-Supervised Speaker Recognition Training using Human-Machine Dialogues" Presentation

self_supervised_speaker_id_using_dialogs.pdf

Presentation of the paper titled "Self-Supervised Speaker Recognition Training using Human-Machine Dialogues" ICASSP 2022 (134)

Categories: Speaker Recognition and Characterization (SPE-SPKR)

3 Views

ATTACK ON PRACTICAL SPEAKER VERIFICATION SYSTEM USING UNIVERSAL ADVERSARIAL PERTURBATIONS

5375slide.pdf

slides (264)

Categories: Applications, Speaker Recognition and Characterization (SPE-SPKR)

18 Views

Short-time spectral aggregation for speaker embedding

State-of-the-art speaker verification systems take frame-level acoustic features as input and produce fixed-dimensional embeddings as utterance-level representations. How information is aggregated from the frame-level features is therefore vital for achieving high performance. This paper introduces short-time spectral pooling (STSP) for better aggregation of frame-level information. STSP transforms the temporal feature maps of a speaker embedding network into the spectral domain and extracts the lowest spectral components of the averaged spectrograms for aggregation.
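
The pooling step described above can be sketched roughly as follows: a minimal, dependency-free illustration, not the paper's implementation. It assumes rectangular windows, a squared-magnitude spectrogram, and illustrative values for the window length, hop size, and number of retained bins; the names `dft_power` and `stsp_pool` are hypothetical.

```python
import cmath
import math


def dft_power(segment, num_bins):
    """Squared-magnitude DFT of one real segment, lowest `num_bins` bins only."""
    n = len(segment)
    power = []
    for k in range(num_bins):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(segment))
        power.append(abs(coeff) ** 2)
    return power


def stsp_pool(feature_map, win_len=8, hop=4, num_bins=2):
    """Pool a (channels x frames) feature map into a fixed-length vector.

    feature_map: list of channels, each a list of frame-level activations.
    win_len, hop, num_bins: assumed hyperparameters, not taken from the paper.
    """
    embedding = []
    for channel in feature_map:
        starts = range(0, len(channel) - win_len + 1, hop)
        if not starts:
            raise ValueError("channel is shorter than one analysis window")
        # Short-time spectrogram: one power spectrum per sliding window.
        spectra = [dft_power(channel[s:s + win_len], num_bins) for s in starts]
        # Average the spectrogram over time, keeping the low-frequency bins.
        averaged = [sum(column) / len(spectra) for column in zip(*spectra)]
        embedding.extend(averaged)
    return embedding
```

The output length is `num_bins` times the number of channels regardless of utterance duration. Bin 0 is the squared sum of each window, so keeping only that bin retains roughly the information of mean pooling, while higher bins add how the activations vary over time.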