
Self-supervised Speaker Verification Employing a Novel Clustering Algorithm

DOI:
10.60864/ynwb-az32
Citation Author(s):
Submitted by:
Abderrahim Fathan
Last updated:
6 June 2024 - 10:28am
Document Type:
Poster
Document Year:
2024
Event:
Presenters:
Abderrahim Fathan
Paper Code:
9492
 

Clustering is an unsupervised learning technique that can leverage large amounts of unlabeled data to learn cluster-wise representations from speech. One of the most popular self-supervised approaches to training a speaker verification system is to predict pseudo-labels with a clustering algorithm and then train the speaker embedding network discriminatively on those pseudo-labels. The performance of pseudo-label-driven self-supervised speaker verification systems therefore relies heavily on the accuracy of the adopted clustering algorithm. In this contribution, we propose a novel clustering technique that (i) combines the predictions of augmented samples to provide a complementary supervisory signal for clustering and imposes symmetry across the augmentations, and (ii) enforces representation invariance via Self-Augmented Training (SAT) while maximizing the information-theoretic dependency between samples and their predicted pseudo-labels.
Experimental results on the VoxCeleb dataset show that the proposed clustering framework achieves better clustering performance across a variety of clustering metrics. The proposed framework also provides better self-supervised speaker verification performance than state-of-the-art approaches trained on the same dataset.
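
To make the two ingredients named in the abstract more concrete, the following is a minimal sketch of a generic objective that rewards mutual information between samples and soft cluster assignments and penalizes disagreement between predictions on clean and augmented inputs. It is not the authors' loss: the function names, the averaging used to combine the two predictions, the symmetric KL term used for the invariance penalty, and the `sat_weight` parameter are all assumptions made for illustration.

```python
import numpy as np

EPS = 1e-12  # small constant to avoid log(0)


def entropy(p, axis=-1):
    """Shannon entropy (in nats) along `axis`."""
    return -np.sum(p * np.log(p + EPS), axis=axis)


def mutual_information(probs):
    """Estimate I(X; Y) = H(Y) - H(Y|X) from soft cluster assignments.

    probs: array of shape (n_samples, n_clusters); each row sums to 1.
    """
    marginal = probs.mean(axis=0)  # empirical p(y) over the batch
    return entropy(marginal) - entropy(probs).mean()


def symmetric_kl(p, q):
    """Symmetric KL divergence between two prediction batches, averaged over samples."""
    log_ratio = np.log(p + EPS) - np.log(q + EPS)
    kl_pq = np.sum(p * log_ratio, axis=-1)
    kl_qp = np.sum(q * -log_ratio, axis=-1)
    return 0.5 * (kl_pq + kl_qp).mean()


def clustering_loss(p_clean, p_aug, sat_weight=1.0):
    """Hypothetical combined loss (lower is better).

    p_clean, p_aug: cluster posteriors for an utterance and its augmented copy.
    The simple average below is an assumed way of combining the predictions.
    """
    combined = 0.5 * (p_clean + p_aug)
    invariance_penalty = symmetric_kl(p_clean, p_aug)
    return sat_weight * invariance_penalty - mutual_information(combined)
```

In this sketch, confident and balanced assignments that agree across augmentations give the lowest loss, while uniform (uninformative) predictions give no mutual-information reward. In practice `p_clean` and `p_aug` would come from the softmax output of a clustering network rather than from fixed arrays.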
