Deep attractor networks for speaker re-identification and blind source separation

Abstract: 

Deep Clustering (DC) and Deep Attractor Networks (DANs) are data-driven approaches to monaural blind source separation.
Both approaches provide astonishing single-channel performance but have not yet been generalized to block-online processing.
When separating speech in a continuous stream with a block-online algorithm, each block's output streams must be assigned to the correct speakers.
In this contribution we solve this block permutation problem by introducing an additional speaker identification embedding to the DAN model structure.
We motivate this model decision by analyzing the embedding topology of DC and DANs and show that DC and DANs themselves are not sufficient for speaker identification.
This model structure (a) improves the SDR over a DAN baseline and (b) provides up to 61 % and up to 34 % relative reduction in permutation error rate and re-identification error rate compared to an i-vector baseline, respectively.
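
To make the block permutation problem concrete, the following is a minimal sketch, not the authors' implementation. It assumes that some model has already produced one speaker identification embedding per separated stream in each block, and it aligns a new block to the speakers seen so far by maximizing the summed cosine similarity over all stream orderings. The function name `resolve_block_permutation` and the toy embeddings are hypothetical and chosen only for illustration.

```python
from itertools import permutations

import numpy as np


def resolve_block_permutation(reference, block):
    """Return, for each reference speaker, the index of the matching block stream.

    reference: (K, D) array, one speaker embedding per speaker seen so far.
    block:     (K, D) array, one speaker embedding per stream in the new block.
    Both inputs are assumed to be given; this sketch does not compute them.
    """
    # Normalize rows so that the dot product equals the cosine similarity.
    ref = reference / np.linalg.norm(reference, axis=1, keepdims=True)
    blk = block / np.linalg.norm(block, axis=1, keepdims=True)

    # similarity[i, j]: reference speaker i compared with block stream j.
    similarity = ref @ blk.T
    num_speakers = similarity.shape[0]

    # Exhaustive search over all orderings; feasible for a few speakers.
    best = max(
        permutations(range(num_speakers)),
        key=lambda perm: sum(similarity[i, perm[i]] for i in range(num_speakers)),
    )
    return list(best)


# Toy example: the two streams of the new block arrive in swapped order.
reference = np.array([[1.0, 0.1, 0.0], [0.0, 0.2, 1.0]])
block = np.array([[0.1, 0.1, 0.9], [0.9, 0.0, 0.1]])
order = resolve_block_permutation(reference, block)
print(order)  # [1, 0]
```

Here `order[i]` is the block stream that should be routed to output `i`, so `[1, 0]` means the two streams have to be swapped. The exhaustive search grows factorially with the number of speakers; for more than a handful of speakers, an assignment solver would be the usual replacement.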

Paper Details

Authors:
Lukas Drude, Thilo von Neumann, Reinhold Haeb-Umbach
Submitted On:
19 April 2018 - 7:00pm
Short Link:
Type:
Presentation Slides
Event:
Presenter's Name:
Lukas Drude
Paper Code:
2012
Document Year:
2018

Document Files

2018-04-17_drude.pdf


[1] Lukas Drude, Thilo von Neumann, Reinhold Haeb-Umbach, "Deep attractor networks for speaker re-identification and blind source separation", IEEE SigPort, 2018. [Online]. Available: http://sigport.org/3037. Accessed: Sep. 21, 2018.
@article{3037-18,
url = {http://sigport.org/3037},
author = {Lukas Drude and Thilo von Neumann and Reinhold Haeb-Umbach},
publisher = {IEEE SigPort},
title = {Deep attractor networks for speaker re-identification and blind source separation},
year = {2018} }