LOW-LATENCY SOUND SOURCE SEPARATION USING DEEP NEURAL NETWORKS

Abstract: 

Low-latency sound source separation requires that each incoming frame of audio data be processed with very little delay and output as soon as possible. For practical purposes involving human listeners, an algorithmic delay of 20 ms is the upper limit that remains comfortable to the listener. In this paper, we propose a low-latency (algorithmic delay ≤ 20 ms) deep neural network (DNN) based source separation method. The proposed method takes advantage of an extended past context and outputs soft time-frequency masking filters, which are then applied to incoming audio frames to give better separation performance than an NMF baseline. Acoustic mixtures from five pairs of speakers from the CMU Arctic database were used for the experiments. Our DNN-based system showed an average improvement of at least 1 dB in source-to-distortion ratio (SDR) over a low-latency NMF baseline for different processing and analysis frame lengths. Incorporating previous temporal context into the DNN inputs yielded significant improvements in SDR for short processing frame lengths.
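The two ideas named in the abstract, past-only context at the network input and a soft time-frequency mask applied to the incoming frame, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names, the 16 kHz sample rate, and the array shapes are all assumptions, and the DNN that would predict the mask is left out entirely.

```python
import numpy as np

def frame_length_samples(delay_ms, sample_rate):
    """Number of samples spanned by a frame of delay_ms milliseconds."""
    return int(round(delay_ms * sample_rate / 1000))

def stack_past_context(frames, n_past):
    """Concatenate each frame with its n_past predecessors.

    frames: array of shape (n_frames, n_bins), e.g. magnitude spectra.
    Returns shape (n_frames, (n_past + 1) * n_bins), oldest frame first.
    Only past frames are used, so no look-ahead delay is introduced.
    """
    n_frames, n_bins = frames.shape
    # Zero-pad the start so the first frames still have a full context.
    padded = np.vstack([np.zeros((n_past, n_bins)), frames])
    return np.hstack([padded[i:i + n_frames] for i in range(n_past + 1)])

def apply_soft_mask(mix_spectrum, mask):
    """Scale each time-frequency bin of the mixture by a mask value in [0, 1]."""
    return np.clip(mask, 0.0, 1.0) * mix_spectrum

if __name__ == "__main__":
    # Assumed sample rate of 16 kHz; a 20 ms frame then spans 320 samples.
    print(frame_length_samples(20, 16000))
    spectra = np.abs(np.random.randn(10, 161))    # placeholder magnitudes
    dnn_input = stack_past_context(spectra, n_past=3)
    print(dnn_input.shape)                        # (10, 644)
    mask = np.full(161, 0.5)                      # stand-in for a DNN output
    print(apply_soft_mask(spectra[-1], mask).shape)
```

In a real system the mask for each frame would come from the trained network evaluated on the stacked context, and the masked spectrum would be inverted back to the time domain; both steps are omitted here.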

Paper Details

Authors:
Tom Barker, Niels Henrik Pontoppidan, Tuomas Virtanen
Submitted On:
8 December 2016 - 3:27pm
Short Link:
Type:
Poster
Event:
Presenter's Name:
Gaurav Naithani
Document Year:
2016
Cite

Document Files

GlobalSIP_poster2.pdf


[1] Tom Barker, Niels Henrik Pontoppidan, Tuomas Virtanen, "LOW-LATENCY SOUND SOURCE SEPARATION USING DEEP NEURAL NETWORKS", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/1426. Accessed: Dec. 18, 2017.
@article{1426-16,
  url = {http://sigport.org/1426},
  author = {Tom Barker and Niels Henrik Pontoppidan and Tuomas Virtanen},
  publisher = {IEEE SigPort},
  title = {LOW-LATENCY SOUND SOURCE SEPARATION USING DEEP NEURAL NETWORKS},
  year = {2016}
}