Incorporating Intra-Spectral Dependencies With A Recurrent Output Layer For Improved Speech Enhancement

Citation Author(s):
Khandokar Md. Nayem, Donald S. Williamson
Submitted by:
Khandokar Md. Nayem
Last updated:
13 October 2019 - 1:29pm
Document Type:
Poster
Document Year:
2019
Event:
Presenters:
Khandokar Md. Nayem
Paper Code:
143
 

Deep-learning-based speech enhancement systems have offered tremendous gains, where the best-performing approaches use long short-term memory (LSTM) recurrent neural networks (RNNs) to model temporal speech correlations. These models, however, do not consider the frequency-level correlations within a single time frame, since dependencies along the frequency axis are often ignored. This results in inaccurate frequency responses that negatively affect perceptual quality and intelligibility. We propose a deep-learning approach that considers both temporal and frequency-level dependencies. More specifically, we enforce spectral dependencies within each time frame by introducing a recurrent output layer that models a Markovian assumption along the frequency axis. We evaluate our approach in a variety of speech and noise environments and objectively show that this recurrent spectral layer offers performance gains over traditional approaches. We also show that our approach outperforms recent approaches that consider frequency-level dependencies.
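
To make the architecture concrete, below is a minimal PyTorch sketch of the idea: one LSTM models correlations across time frames, and a second recurrent layer then sweeps across the frequency bins within each frame, so that every bin's output depends on its lower-frequency neighbors. The class name, layer sizes, projection step, and masking-based objective are all illustrative assumptions, not the authors' exact configuration.

import torch
import torch.nn as nn

class RecurrentOutputEnhancer(nn.Module):
    # Hypothetical sketch: a time-axis LSTM followed by a recurrent
    # output layer that runs along the frequency axis of each frame.
    def __init__(self, n_freq=257, t_hidden=512, f_feat=32, f_hidden=64):
        super().__init__()
        # Temporal modeling over frames of a magnitude spectrogram.
        self.time_lstm = nn.LSTM(n_freq, t_hidden, num_layers=2, batch_first=True)
        # Project each frame's temporal state to per-bin feature vectors.
        self.to_bins = nn.Linear(t_hidden, n_freq * f_feat)
        # Recurrence across bins enforces the Markovian assumption
        # along the frequency axis (bin k depends on bins below k).
        self.freq_rnn = nn.LSTM(f_feat, f_hidden, batch_first=True)
        self.out = nn.Linear(f_hidden, 1)  # one mask value per bin

    def forward(self, noisy_mag):
        # noisy_mag: (batch, frames, n_freq) magnitude spectrogram
        b, t, f = noisy_mag.shape
        h, _ = self.time_lstm(noisy_mag)            # (b, t, t_hidden)
        bins = self.to_bins(h).view(b * t, f, -1)   # (b*t, n_freq, f_feat)
        g, _ = self.freq_rnn(bins)                  # sweep low -> high frequency
        mask = torch.sigmoid(self.out(g)).view(b, t, f)
        return mask * noisy_mag                     # masked enhancement

# Example: enhance a batch of 4 utterances, 100 frames, 257 FFT bins.
# model = RecurrentOutputEnhancer()
# enhanced = model(torch.rand(4, 100, 257))

Running the frequency-axis recurrence inside each frame is what distinguishes this from a plain temporal LSTM: the per-bin hidden state carries information from adjacent bins, which is the intra-spectral dependency the paper targets.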
