
MULTICHANNEL SPEECH SEPARATION WITH RECURRENT NEURAL NETWORKS FROM HIGH-ORDER AMBISONICS RECORDINGS

Abstract: 

We present a source separation system for high-order ambisonics (HOA) content. We derive a multichannel spatial filter from a mask estimated by a long short-term memory (LSTM) recurrent neural network. Assuming that the directions of arrival of the directional sources are known, we feed the LSTM with one channel of the mixture combined with the outputs of basic HOA beamformers. In our experiments, the speech of interest is corrupted either by diffuse noise or by an equally loud competing speaker. We show that feeding the LSTM with the output of a beamformer steered toward the competing speech, in addition to that of the beamformer steered toward the target speech, significantly improves the word error rate.
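To illustrate how a time-frequency mask can be turned into a multichannel spatial filter, here is a minimal NumPy sketch. The abstract does not say which filter the authors derive, so the choice here is an assumption: mask-weighted spatial covariance matrices followed by a multichannel Wiener filter. The function name `mask_based_mwf`, its parameters, and the array layout are hypothetical. The mask is taken as given, standing in for the LSTM output.

```python
import numpy as np

def mask_based_mwf(X, mask, ref=0, eps=1e-6):
    """Sketch of a mask-driven multichannel Wiener filter (assumed design).

    X    : complex STFT of the mixture, shape (channels, freqs, frames)
    mask : target mask in [0, 1], shape (freqs, frames); assumed to come
           from some estimator such as an LSTM
    ref  : index of the reference channel whose target component is estimated
    Returns the filtered single-channel STFT, shape (freqs, frames).
    """
    C, F, T = X.shape
    Y = np.empty((F, T), dtype=complex)
    for f in range(F):
        Xf = X[:, f, :]                 # (C, T) mixture at this frequency
        m = mask[f]                     # (T,) target mask at this frequency
        # Mask-weighted spatial covariance of the target and of the rest
        R_s = (m * Xf) @ Xf.conj().T / max(m.sum(), eps)
        R_n = ((1.0 - m) * Xf) @ Xf.conj().T / max((1.0 - m).sum(), eps)
        # Small diagonal loading keeps the system solvable
        R_x = R_s + R_n + eps * np.eye(C)
        # Wiener solution for the reference channel: w = R_x^{-1} R_s e_ref
        w = np.linalg.solve(R_x, R_s[:, ref])
        Y[f] = w.conj() @ Xf            # apply the filter: y = w^H x
    return Y
```

With a mask of all ones the filter passes the reference channel nearly unchanged, and with a mask of all zeros it outputs silence, which makes for a quick sanity check before plugging in a learned mask.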


Paper Details

Authors:
Emmanuel Vincent, Alexandre Guérin
Submitted On:
19 April 2018 - 5:18pm
Short Link:
Type:
Presentation Slides
Event:
Presenter's Name:
Laureline Perotin
Paper Code:
AASP-L2.2
Document Year:
2018

Document Files

perotin.pdf


[1] Emmanuel Vincent, Alexandre Guérin, "MULTICHANNEL SPEECH SEPARATION WITH RECURRENT NEURAL NETWORKS FROM HIGH-ORDER AMBISONICS RECORDINGS", IEEE SigPort, 2018. [Online]. Available: http://sigport.org/3031. Accessed: May. 21, 2018.