Source Separation and Signal Enhancement

Improving Universal Sound Separation Using Sound Classification Presentation

Deep learning approaches have recently achieved impressive performance on both audio source separation and sound classification. Most audio source separation approaches focus only on separating sources belonging to a restricted domain of source classes, such as speech and music. However, recent work has demonstrated the possibility of "universal sound separation", which aims to separate acoustic sources from an open domain, regardless of their class.

etzinis_improving_icassp2020_slides.pdf (546)

Categories: Source Separation and Signal Enhancement

394 Views

PEVD-based Speech Enhancement in Reverberant Environments

The enhancement of noisy speech is important for applications involving human-to-human interactions, such as telecommunications and hearing aids, as well as human-to-machine interactions, such as voice-controlled systems and robot audition. In this work, we focus on reverberant environments. It is shown that, by exploiting the lack of correlation between speech and the late reflections, further noise reduction can be achieved. This is verified using simulations involving actual acoustic impulse responses and noise from the ACE corpus.

[ICASSP2020]_PEVD_based_Speech_Enhancement_in_Reverberant_Environments_Handout.pdf (301)

Categories: Source Separation and Signal Enhancement

47 Views

A Bayesian Generative Model With Gaussian Process Priors For Thermomechanical Analysis Of Micro-Resonators

Thermal analysis using resonating micro-electromechanical systems shows great promise in characterizing materials in the early stages of research. The resonant behaviour of a model drug, theophylline monohydrate, is measured across the surface during thermal cycling, with actuation by a piezoelectric speaker and readout by a laser-Doppler vibrometer. The result is a sequence of spectra that are strongly correlated in time, temperature, and spatial location of the readout. Traditionally, each spectrum is analyzed individually to locate the resonance peak.

MLSP_2019_Poster_v7_final.pdf (285)

Categories: Biomedical signal processing, Source Separation and Signal Enhancement

47 Views

Speech Enhancement Using Polynomial Eigenvalue Decomposition

Speech enhancement is important for applications such as telecommunications, hearing aids, automatic speech recognition, and voice-controlled systems. Enhancement algorithms aim to reduce interfering noise while minimizing speech distortion. In this work, we propose to use polynomial matrices to exploit the spatial, spectral, and temporal correlations between the speech signals received by the microphone array.
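As a rough illustration of how a polynomial matrix can capture spatial and temporal correlation together, the sketch below estimates a space-time covariance matrix from multichannel samples: one spatial covariance matrix per time lag. This is an assumption about the setup, since the abstract does not say which polynomial matrix is used. The function name `space_time_covariance`, the lag range, and the random test signal are hypothetical, and the sketch stops short of the decomposition itself.

```python
import numpy as np

def space_time_covariance(x, max_lag):
    """Estimate R[tau] = mean over n of x[n] x[n - tau]^H for tau in -max_lag..max_lag.

    x has shape (channels, samples). The result has shape
    (2 * max_lag + 1, channels, channels); index max_lag holds lag 0.
    """
    m, n = x.shape
    R = np.zeros((2 * max_lag + 1, m, m), dtype=complex)
    for idx, tau in enumerate(range(-max_lag, max_lag + 1)):
        if tau >= 0:
            a, b = x[:, tau:], x[:, :n - tau]   # pairs x[k + tau] with x[k]
        else:
            a, b = x[:, :n + tau], x[:, -tau:]  # pairs x[k] with x[k + |tau|]
        R[idx] = a @ b.conj().T / n
    return R

# Hypothetical 3-microphone signal: white noise stands in for real recordings.
rng = np.random.default_rng(0)
x = rng.standard_normal((3, 1000))
R = space_time_covariance(x, max_lag=4)
```

Each slice `R[i]` is the coefficient of one lag, so the stack can be read as a matrix whose entries are polynomials in the delay operator. By construction the slice at lag -tau is the conjugate transpose of the slice at lag tau.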

[WASPAA]_Speech_Enhancement_Using_PEVD_Handout.pdf (357)

Categories: Source Separation and Signal Enhancement

134 Views

An Improved Measure of Musical Noise Based on Spectral Kurtosis

Audio processing methods operating on a time-frequency representation of the signal can introduce unpleasant-sounding artifacts known as musical noise. These artifacts are observed in the context of audio coding, speech enhancement, and source separation. The change in kurtosis of the power spectrum introduced during processing was shown to correlate with the human perception of musical noise in the context of speech enhancement, leading to the proposal of measures based on it. These baseline measures are shown here to correlate with human perception only in a limited manner.
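The idea behind the baseline measures can be sketched as a comparison of power-spectrum kurtosis before and after processing. The code below is a minimal illustration, assuming the kurtosis is pooled over all time-frequency bins and the change is reported as a log ratio. The published measures may be defined differently, and the function names and the crude spectral-subtraction step are hypothetical stand-ins.

```python
import numpy as np

def kurtosis(x):
    """Non-excess sample kurtosis: fourth central moment over squared variance."""
    x = np.asarray(x, dtype=float).ravel()
    centered = x - x.mean()
    m2 = np.mean(centered ** 2)
    m4 = np.mean(centered ** 4)
    return m4 / m2 ** 2

def log_kurtosis_ratio(power_before, power_after):
    """Log of the ratio of power-spectrum kurtosis after and before processing."""
    return np.log(kurtosis(power_after) / kurtosis(power_before))

# Hypothetical input: the spectrogram of complex Gaussian noise.
rng = np.random.default_rng(0)
shape = (257, 200)  # (frequency bins, frames), chosen arbitrarily
noise_spec = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
power_in = np.abs(noise_spec) ** 2

# Crude over-subtraction with a spectral floor, a typical source of isolated peaks.
power_out = np.maximum(power_in - 2.0 * power_in.mean(), 0.01 * power_in)

delta = log_kurtosis_ratio(power_in, power_out)
```

A positive `delta` means the processed power spectrum is more heavy-tailed than the input, which is the signature of sparse residual peaks.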

poster_FINAL.pdf (736)

Categories: Source Separation and Signal Enhancement

137 Views

Incorporating Intra-Spectral Dependencies With A Recurrent Output Layer For Improved Speech Enhancement

Deep-learning-based speech enhancement systems have offered tremendous gains, with the best-performing approaches using long short-term memory (LSTM) recurrent neural networks (RNNs) to model temporal speech correlations. These models, however, often ignore the frequency-level correlations within a single time frame, that is, the spectral dependencies along the frequency axis. This results in inaccurate frequency responses that negatively affect perceptual quality and intelligibility. We propose a deep-learning approach that considers both temporal and frequency-level dependencies.

intraspectral2019mlsp_poster_v3.pdf

Intra-Spectra Recurrent Output Layer (287)

Categories: Source Separation and Signal Enhancement, Neural network learning (MLR-NNLR)

52 Views

Fast MVAE: Joint separation and classification of mixed sources based on multichannel variational autoencoder with auxiliary classifier

Li2019ICASSP05poster_v2.pdf (347)

Categories: Source Separation and Signal Enhancement

92 Views

Joint Separation and Dereverberation of Reverberant Mixture with Multichannel Variational Autoencoder

AASP_L4_2.pdf (286)

Categories: Source Separation and Signal Enhancement

84 Views

Incremental Binarization On Recurrent Neural Networks For Single-Channel Source Separation

This paper proposes a Bitwise Gated Recurrent Unit (BGRU) network for the single-channel source separation task. Recurrent neural networks (RNNs) require several sets of weights within their cells, which significantly increases the computational cost compared to fully-connected networks. To mitigate this increased computation, we focus on GRU cells and quantize the feedforward procedure with binarized values and bitwise operations. The BGRU network is trained in two stages.
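To show why binarized values make bitwise operations possible, the sketch below computes a dot product between two vectors of +1/-1 values by counting sign disagreements with XOR. It is a generic illustration of binarized arithmetic, not the BGRU cell or its two-stage training. The helper names `binarize` and `bitwise_dot` and the vector length are hypothetical.

```python
import numpy as np

def binarize(x):
    """Map real values to +1 / -1 by sign (zero maps to +1)."""
    return np.where(x >= 0, 1, -1).astype(np.int8)

def bitwise_dot(a_bits, b_bits):
    """Dot product of two +1/-1 vectors given as boolean sign bits (True = +1).

    Agreeing positions contribute +1 and disagreeing positions -1,
    so the result is n - 2 * (number of disagreements).
    """
    n = a_bits.size
    disagreements = np.count_nonzero(np.logical_xor(a_bits, b_bits))
    return n - 2 * disagreements

# Hypothetical weight and activation vectors.
rng = np.random.default_rng(0)
w = rng.standard_normal(64)
h = rng.standard_normal(64)
w_b, h_b = binarize(w), binarize(h)

# Integer multiply-accumulate reference versus the XOR-and-count version.
reference = int(np.dot(w_b.astype(np.int32), h_b.astype(np.int32)))
fast = bitwise_dot(w_b > 0, h_b > 0)
```

On hardware, the XOR and the count map to single instructions over packed words, which is where the savings over floating-point multiply-accumulate come from.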

ICASSP_2019_Poster_final.pdf

Bitwise Gated Recurrent Units (307)

Categories: Source Separation and Signal Enhancement, Speech Enhancement (SPE-ENHA)

10 Views

Speech Denoising by Parametric Resynthesis

This work proposes the use of clean speech vocoder parameters as the target for a neural network performing speech enhancement. These parameters have been designed for text-to-speech synthesis so that they both produce high-quality resyntheses and also are straightforward to model with neural networks, but have not been utilized in speech enhancement until now. In comparison to a matched text-to-speech system that is given the ground truth transcripts of the noisy speech, our model is ...

poster.pdf (255)

Categories: Source Separation and Signal Enhancement

7 Views
