Sufficiency quantification for seamless text-independent speaker enrollment
- Submitted by: Gokcen Cilingir
- Last updated: 13 July 2018 - 3:38pm
- Document Type: Poster
- Document Year: 2018
- Presenters: Gokcen Cilingir
- Paper Code: SP-P5.8
Text-independent speaker recognition (TI-SR) requires a lengthy enrollment process in which the user must set aside dedicated time so that a reliable model of their voice can be created. Seamless enrollment is a highly attractive alternative: enrollment happens in the background and asks for no dedicated time from the user. A key problem in a fully automated seamless enrollment process is determining whether a given collection of utterances is sufficient for TI-SR. No known metric exists in the literature for quantifying sufficiency. This paper introduces a novel metric called the phoneme-richness score. The quality of a sufficiency metric can be assessed by its correlation with TI-SR performance. Our assessment shows that the phoneme-richness score achieves a correlation of -0.96 with TI-SR performance (measured as equal error rate), which is highly significant, whereas a naive sufficiency metric such as speech duration achieves a correlation of only -0.68.
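The assessment described above, correlating a sufficiency metric with equal error rate (EER), can be sketched as follows. This is a minimal illustration, not the authors' procedure: the abstract does not say which correlation coefficient was used, so Pearson correlation is an assumption here, and the function name `pearson_correlation` and all numeric values are hypothetical placeholders rather than data from the paper.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Placeholder values only (NOT results from the paper): one sufficiency
# score and one EER measurement per hypothetical enrollment collection.
metric_scores = [0.2, 0.4, 0.6, 0.8]
eer_percent = [12.0, 9.0, 6.0, 3.0]

r = pearson_correlation(metric_scores, eer_percent)
print(f"correlation between metric and EER: {r:.2f}")
```

Because a lower EER means better recognition, a useful sufficiency metric is expected to correlate negatively with EER, and a value closer to -1 indicates that the metric tracks TI-SR performance more closely.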