Fully Supervised Speaker Diarization

Abstract: 

In this paper, we propose a fully supervised speaker diarization approach, named unbounded interleaved-state recurrent neural networks (UIS-RNN). Given extracted speaker-discriminative embeddings (a.k.a. d-vectors) from input utterances, each individual speaker is modeled by a parameter-sharing RNN, while the RNN states for different speakers interleave in the time domain. This RNN is naturally integrated with a distance-dependent Chinese restaurant process (ddCRP) to accommodate an unknown number of speakers. Our system is fully supervised and is able to learn from examples where time-stamped speaker labels are annotated. We achieved a 7.6% diarization error rate on NIST SRE 2000 CALLHOME, which is better than the state-of-the-art method using spectral clustering. Moreover, our method decodes in an online fashion while most state-of-the-art systems rely on offline clustering.
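
To make the role of the ddCRP more concrete, the following is a minimal Python sketch of how a prior of that family can score the next speaker once a speaker change is assumed to happen. Each previously seen speaker is weighted by the number of contiguous segments they have spoken so far, and a brand-new speaker is weighted by a concentration parameter. The function name `next_speaker_probs`, the parameter `alpha`, and the block-count weighting are assumptions made for illustration and are not taken from this page. The sketch covers only the speaker-assignment prior and leaves out the RNN that models the d-vector sequence.

```python
from collections import Counter
from itertools import groupby


def next_speaker_probs(labels, alpha=1.0):
    """Distribution over the next speaker, given that a speaker change occurs.

    labels: integer speaker ids observed so far, one per segment (hypothetical input).
    alpha:  weight given to a previously unseen speaker (assumed hyperparameter).
    """
    if not labels:
        # No history yet: the first segment always opens speaker 0.
        return {0: 1.0}

    current = labels[-1]
    # Count contiguous blocks per speaker, e.g. [0, 0, 1, 0] has 2 blocks for speaker 0.
    blocks = Counter(speaker for speaker, _ in groupby(labels))
    new_speaker = max(blocks) + 1

    # A change must move away from the current speaker, so it is excluded.
    weights = {s: float(n) for s, n in blocks.items() if s != current}
    weights[new_speaker] = alpha

    total = sum(weights.values())
    return {s: w / total for s, w in weights.items()}


if __name__ == "__main__":
    print(next_speaker_probs([0, 0, 1, 1, 0, 2, 2], alpha=1.0))
```

Because a new speaker always receives non-zero weight, the number of speakers does not have to be fixed in advance, which is the property the abstract attributes to the ddCRP component.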

Paper Details

Authors:
Aonan Zhang, Quan Wang, Zhenyao Zhu, John Paisley, Chong Wang
Submitted On:
24 April 2019 - 11:06am
Short Link:
Type:
Poster
Event:
Presenter's Name:
Quan Wang
Paper Code:
1112
Document Year:
2019

Document Files

icassp2019_supervised_diarization_poster.pdf


[1] Aonan Zhang, Quan Wang, Zhenyao Zhu, John Paisley, Chong Wang, "Fully Supervised Speaker Diarization", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/3897. Accessed: Nov. 17, 2019.