
A Comparison of Boosted Deep Neural Networks for Voice Activity Detection

Abstract: 

Voice activity detection (VAD) is an integral part of speech processing for real-world problems, and considerable work has gone into improving VAD performance. Recently, deep neural networks have been used to detect the presence of speech, yielding substantial gains. Unfortunately, these efforts have either been restricted to feed-forward neural networks, which do not adequately capture frequency and temporal correlations, or have used recurrent architectures that were not adequately tested in noisy environments. In this paper, we investigate different neural network configurations for voice activity detection. More specifically, we explore solutions that incorporate multi-resolution stacking and ensemble learning using convolutional, long short-term memory (LSTM), and dilated convolutional neural network architectures. We evaluate our approach on speech signals captured in varying amounts of noise. Our results show that a multi-resolution ensemble approach using LSTM recurrent neural networks performs best in both seen and unseen testing scenarios.
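As a rough illustration of what combining several context resolutions can look like at the frame level, here is a minimal Python sketch. It is not the authors' method: the function names, the half-widths, the threshold, and the energy-based scorer (a toy stand-in for a trained network) are all assumptions made for this example, and it uses plain score averaging rather than the boosting or stacking studied in the paper.

```python
from statistics import mean


def context_window(frames, center, half_width):
    """Return the frames within half_width of center, repeating edge frames."""
    last = len(frames) - 1
    return [frames[min(max(i, 0), last)]
            for i in range(center - half_width, center + half_width + 1)]


def energy_score(window):
    """Toy stand-in for a trained model: mean energy squashed into [0, 1)."""
    e = mean(window)
    return e / (1.0 + e)


def ensemble_vad(frames, half_widths=(1, 2, 4), threshold=0.5):
    """Score each frame at several context sizes, average, then threshold."""
    decisions = []
    for t in range(len(frames)):
        scores = [energy_score(context_window(frames, t, w))
                  for w in half_widths]
        decisions.append(mean(scores) >= threshold)
    return decisions
```

Here `frames` is assumed to be a list of non-negative per-frame energies. In a real system each scorer would be a trained network (for example an LSTM) operating on spectral features, and the combination rule would be learned rather than fixed.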


Paper Details

Authors:
Harshit Krishnakumar, Donald S. Williamson
Submitted On:
12 November 2019 - 10:09pm
Type:
Poster
Presenter's Name:
Donald S. Williamson
Paper Code:
1570567217
Document Year:
2019

Document Files

williamson.pdf


[1] Harshit Krishnakumar, Donald S. Williamson, "A Comparison of Boosted Deep Neural Networks for Voice Activity Detection", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4952. Accessed: Dec. 12, 2019.