Sufficiency quantification for seamless text-independent speaker enrollment

Text-independent speaker recognition (TI-SR) requires a lengthy enrollment process in which the user must set aside dedicated time to create a reliable model of their voice. Seamless enrollment is a highly attractive alternative: enrollment happens in the background and requires no dedicated time from the user. A key problem in a fully automated seamless enrollment process is determining whether a given collection of utterances is sufficient for TI-SR. No known metric in the literature quantifies sufficiency. This paper introduces a novel metric called the phoneme-richness score. The quality of a sufficiency metric can be assessed via its correlation with TI-SR performance. Our assessment shows that the phoneme-richness score achieves a correlation of -0.96 with TI-SR performance (measured in equal error rate), which is highly significant, whereas a naive sufficiency metric such as speech duration achieves a correlation of only -0.68.
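The assessment described in the abstract, correlating a sufficiency metric with equal error rate (EER), might be sketched as follows. This is a minimal illustration only: the abstract does not say which correlation coefficient was used, so Pearson correlation is an assumption here, the function name `pearson_correlation` is hypothetical, and the numbers are made-up placeholders rather than results from the paper.

```python
import math


def pearson_correlation(metric_values, eer_values):
    """Pearson correlation between a sufficiency metric and EER.

    Assumption: Pearson is used for illustration; the abstract does not
    specify the coefficient. Each index is one enrollment collection.
    """
    if len(metric_values) != len(eer_values) or len(metric_values) < 2:
        raise ValueError("need two equal-length sequences with >= 2 points")

    n = len(metric_values)
    mean_x = sum(metric_values) / n
    mean_y = sum(eer_values) / n

    # Center both series around their means.
    xc = [x - mean_x for x in metric_values]
    yc = [y - mean_y for y in eer_values]

    covariance = sum(a * b for a, b in zip(xc, yc))
    denom = math.sqrt(sum(a * a for a in xc) * sum(b * b for b in yc))
    if denom == 0.0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance / denom


# Hypothetical placeholder data, NOT taken from the paper:
# one sufficiency score and one EER (in percent) per enrollment collection.
example_scores = [1.0, 2.0, 3.0, 4.0, 5.0]
example_eer = [20.0, 15.0, 11.0, 8.0, 6.0]

r = pearson_correlation(example_scores, example_eer)
print(f"correlation = {r:.2f}")
```

Since a lower EER means better recognition, a good sufficiency metric is expected to correlate strongly and negatively with EER, which matches the sign of the values reported in the abstract.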

ICASSP2018_poster_Cilingir.pdf

Poster presented at ICASSP 2018 (754)

SufficiencyMetric_ICASP2018_Cilingir.pdf

Paper for ICASSP 2018 (778)
