

Multi-Task Joint-Learning for Robust Voice Activity Detection

Abstract: 

Model-based voice activity detection (VAD) approaches have been widely
used and have achieved success in practice. These approaches usually cast
VAD as a frame-level classification problem and employ statistical
classifiers, such as a Gaussian Mixture Model (GMM) or a
Deep Neural Network (DNN), to assign a speech/silence label
to each frame. Because each frame is classified independently,
the VAD results tend to be fragile. To address this
problem, this paper proposes a new structured multi-frame prediction
DNN approach to improve segment-level
VAD performance. During DNN training, the VAD labels of multiple
consecutive frames are concatenated as targets, and the network is
jointly trained with a speech enhancement task to achieve robustness
under noisy conditions. During testing, the VAD label
for each frame is obtained by merging the prediction results
from neighbouring frames. Experiments on the Aurora 4
dataset showed that a conventional DNN-based VAD has poor
and unstable prediction performance, while the proposed multi-task
trained VAD is much more robust.
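
The multi-frame idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names, the context width, the edge padding, and the choice of averaging followed by a 0.5 threshold as the "merging" rule are all assumptions made for illustration, and the DNN and the speech enhancement task are left out entirely. The first function builds concatenated label targets for training; the second merges overlapping per-frame predictions at test time.

```python
def multi_frame_targets(labels, context=2):
    """Build one target vector per frame from the labels of frames
    t-context .. t+context (hypothetical helper, not from the paper).
    Edges are padded by repeating the first/last label (an assumption)."""
    n = len(labels)
    targets = []
    for t in range(n):
        window = []
        for k in range(-context, context + 1):
            idx = min(max(t + k, 0), n - 1)  # clamp to the valid range
            window.append(labels[idx])
        targets.append(window)
    return targets


def merge_predictions(pred, context=2, threshold=0.5):
    """Merge overlapping multi-frame predictions into one label per frame.
    pred[t][k + context] is the speech probability that the output
    centred on frame t assigns to frame t + k. Averaging and thresholding
    is an assumed merging rule; the abstract does not specify one."""
    n = len(pred)
    sums = [0.0] * n
    counts = [0] * n
    for t in range(n):
        for k in range(-context, context + 1):
            j = t + k
            if 0 <= j < n:  # ignore predictions that fall off the edges
                sums[j] += pred[t][k + context]
                counts[j] += 1
    return [1 if sums[j] / counts[j] >= threshold else 0 for j in range(n)]


if __name__ == "__main__":
    labels = [0, 0, 1, 1, 1, 0]
    targets = multi_frame_targets(labels, context=1)
    print(targets)
    # One unreliable centre prediction (0.1) is outvoted by its neighbours.
    noisy = [[0.9, 0.9, 0.9], [0.9, 0.1, 0.9], [0.9, 0.9, 0.9]]
    print(merge_predictions(noisy, context=1))
```

Because every frame receives a prediction from each neighbouring output window, a single unreliable frame-level decision can be outvoted, which is one plausible reading of why merging helps segment-level stability.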


Paper Details

Authors:
Yanmin Qian, Kai Yu
Submitted On:
15 October 2016 - 3:51am
Short Link:
Type:
Presentation Slides
Event:
Presenter's Name:
Yimeng Zhuang
Paper Code:
35
Document Year:
2016

Document Files

zhuang-iscslp16-slides.pdf



[1] Yanmin Qian, Kai Yu, "Multi-Task Joint-Learning for Robust Voice Activity Detection", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/1220. Accessed: Sep. 27, 2020.
@article{1220-16,
url = {http://sigport.org/1220},
author = {Yanmin Qian and Kai Yu},
publisher = {IEEE SigPort},
title = {Multi-Task Joint-Learning for Robust Voice Activity Detection},
year = {2016} }