CBLDNN-BASED SPEAKER-INDEPENDENT SPEECH SEPARATION VIA GENERATIVE ADVERSARIAL TRAINING

Abstract: 

We propose CBLDNN-GAT, a speaker-independent, multi-speaker monaural speech separation system based on a convolutional, bidirectional long short-term memory, deep feed-forward neural network (CBLDNN) with generative adversarial training (GAT). The system aims at better speech quality rather than only minimizing the mean squared error (MSE). In the initial phase, we use log-mel filterbank and pitch features to warm up the CBLDNN in a multi-task manner, so that information that helps separate speech and improve speech quality is integrated into the model. We apply GAT throughout training, which pushes the separated speech to be indistinguishable from real speech. We evaluate CBLDNN-GAT on the WSJ0-2mix dataset. The experimental results show that the proposed model achieves an 11.0 dB signal-to-distortion ratio (SDR) improvement, which is a new state-of-the-art result.
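
To make the reported metric concrete, the following is a minimal sketch of how an SDR improvement can be computed. It assumes a simplified energy-ratio definition of SDR (reference energy over error energy), which is not necessarily the evaluation procedure used in the paper; published results on separation benchmarks are commonly computed with more elaborate toolkits. The function names `sdr_db` and `sdr_improvement_db`, the toy signals, and the "separator output" are all hypothetical and chosen only for illustration.

```python
import math


def sdr_db(reference, estimate):
    """Simplified SDR in dB: 10 * log10(||s||^2 / ||s - s_hat||^2).

    Assumption: plain energy ratio, without the filtering and projection
    steps that full evaluation toolkits apply.
    """
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have the same length")
    signal_energy = sum(s * s for s in reference)
    error_energy = sum((s - e) ** 2 for s, e in zip(reference, estimate))
    if error_energy == 0.0:
        return math.inf  # perfect reconstruction
    return 10.0 * math.log10(signal_energy / error_energy)


def sdr_improvement_db(reference, mixture, estimate):
    """SDR of the separated estimate minus SDR of the unprocessed mixture."""
    return sdr_db(reference, estimate) - sdr_db(reference, mixture)


# Toy data (hypothetical): one target speaker plus one interfering signal.
reference = [math.sin(0.1 * n) for n in range(200)]
interferer = [0.5 * math.sin(0.37 * n + 1.0) for n in range(200)]
mixture = [s + i for s, i in zip(reference, interferer)]

# Stand-in for a separator output that leaves 10% of the interferer amplitude.
estimate = [s + 0.1 * i for s, i in zip(reference, interferer)]

print(round(sdr_improvement_db(reference, mixture, estimate), 1))  # prints 20.0
```

In this toy case the residual interference amplitude drops by a factor of 10, so its energy drops by a factor of 100 and the improvement is 20 dB. The 11.0 dB figure in the abstract is the analogous difference reported on WSJ0-2mix.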

Paper Details

Authors: Chenxing Li, Lei Zhu, Shuang Xu, Peng Gao, Bo Xu
Submitted On: 22 April 2018 - 9:43pm
Short Link:
Type: Poster
Event:
Presenter's Name: Chenxing Li
Paper Code: AASP-P11.7
Document Year: 2018

Document Files

conference_poster_4.pdf


[1] Chenxing Li, Lei Zhu, Shuang Xu, Peng Gao, Bo Xu, "CBLDNN-BASED SPEAKER-INDEPENDENT SPEECH SEPARATION VIA GENERATIVE ADVERSARIAL TRAINING", IEEE SigPort, 2018. [Online]. Available: http://sigport.org/3140. Accessed: Apr. 23, 2019.
@article{3140-18,
url = {http://sigport.org/3140},
author = {Chenxing Li and Lei Zhu and Shuang Xu and Peng Gao and Bo Xu},
publisher = {IEEE SigPort},
title = {CBLDNN-BASED SPEAKER-INDEPENDENT SPEECH SEPARATION VIA GENERATIVE ADVERSARIAL TRAINING},
year = {2018} }