Source Separation and Signal Enhancement

A Dynamic Latent Variable Model for Source Separation

We propose a novel latent variable model for learning latent bases of time-varying non-negative data. Our model uses a mixture multinomial as the likelihood function and places a Dirichlet distribution with dynamic parameters on it as a prior, which we call the dynamic Dirichlet prior. An expectation-maximization (EM) algorithm is developed for estimating the parameters of the proposed model.
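To make the EM idea concrete, here is a minimal sketch of EM for a plain mixture-multinomial model of non-negative data. It deliberately omits the dynamic Dirichlet prior, whose exact form is not given in this abstract, so it should be read as a baseline under that assumption and not as the proposed model. The function names `plca_em` and `log_likelihood` and the toy matrix `COUNTS` are illustrative choices.

```python
import math
import random

def plca_em(v, n_bases, iters=50, seed=0):
    """EM for v[f][t] ~ sum_z P(f|z) P(z|t), with no prior (assumption)."""
    rng = random.Random(seed)
    n_f, n_t = len(v), len(v[0])

    def random_distribution(n):
        x = [rng.random() + 0.1 for _ in range(n)]
        s = sum(x)
        return [a / s for a in x]

    basis = [random_distribution(n_f) for _ in range(n_bases)]   # basis[z][f] = P(f|z)
    weight = [random_distribution(n_bases) for _ in range(n_t)]  # weight[t][z] = P(z|t)
    for _ in range(iters):
        new_basis = [[0.0] * n_f for _ in range(n_bases)]
        new_weight = [[0.0] * n_bases for _ in range(n_t)]
        for f in range(n_f):
            for t in range(n_t):
                if v[f][t] == 0:
                    continue
                # E-step: posterior over the latent component z for this (f, t) cell.
                joint = [basis[z][f] * weight[t][z] for z in range(n_bases)]
                total = sum(joint)
                if total == 0:
                    continue
                # M-step accumulation: expected counts assigned to each z.
                for z in range(n_bases):
                    c = v[f][t] * joint[z] / total
                    new_basis[z][f] += c
                    new_weight[t][z] += c
        # M-step normalization: turn expected counts back into distributions.
        basis = [[x / (sum(row) or 1.0) for x in row] for row in new_basis]
        weight = [[x / (sum(row) or 1.0) for x in row] for row in new_weight]
    return basis, weight

def log_likelihood(v, basis, weight):
    """Multinomial log-likelihood of the data, up to an additive constant."""
    ll = 0.0
    for f in range(len(v)):
        for t in range(len(v[0])):
            if v[f][t] > 0:
                p = sum(basis[z][f] * weight[t][z] for z in range(len(basis)))
                ll += v[f][t] * math.log(p)
    return ll

# Toy non-negative data (rows: features, columns: time frames); made up for illustration.
COUNTS = [[1, 0, 2], [0, 3, 3], [2, 0, 4], [0, 1, 1]]
basis, weight = plca_em(COUNTS, n_bases=2)
print(log_likelihood(COUNTS, basis, weight))
```

A prior on the bases or weights would change only the M-step normalization, which is where a MAP variant would add its prior-dependent terms.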

ICASSP_18.pdf (514)

Categories: Source Separation and Signal Enhancement

32 Views

TasNet: time-domain audio separation network for real-time, single-channel speech separation

Robust speech processing in multi-talker environments requires effective speech separation. Recent deep learning systems have made significant progress toward solving this problem, yet it remains challenging, particularly in real-time, low-latency applications. Most methods attempt to construct a mask for each source in a time-frequency representation of the mixture signal, which is not necessarily an optimal representation for speech separation.
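The conventional masking approach mentioned above can be sketched as follows. This is an oracle ratio mask computed from known source spectra, which is only an illustration of what "a mask for each source" means; it is not TasNet, which works in the time domain. The function names and the toy complex coefficients are assumptions made for this example.

```python
def ratio_masks(source_specs, eps=1e-12):
    """One magnitude-ratio mask per source.

    source_specs: list of sources, each a 2-D list [frame][bin] of complex
    time-frequency coefficients (for example, from an STFT).
    """
    n_frames, n_bins = len(source_specs[0]), len(source_specs[0][0])
    masks = []
    for spec in source_specs:
        mask = [[abs(spec[t][f]) / (sum(abs(s[t][f]) for s in source_specs) + eps)
                 for f in range(n_bins)]
                for t in range(n_frames)]
        masks.append(mask)
    return masks

def apply_mask(mixture_spec, mask):
    """Scale each mixture coefficient by the real-valued mask (mixture phase is kept)."""
    return [[m * x for m, x in zip(mask_row, mix_row)]
            for mask_row, mix_row in zip(mask, mixture_spec)]

# Toy spectra (2 frames x 3 bins), chosen so the sources do not overlap in any bin.
S1 = [[1 + 1j, 0j, 0j], [0j, 2j, 0j]]
S2 = [[0j, 3 + 0j, 0j], [0j, 0j, 1 - 1j]]
MIX = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(S1, S2)]

M1, M2 = ratio_masks([S1, S2])
EST1 = apply_mask(MIX, M1)
print(EST1)
```

When the sources overlap in a bin, the masked mixture only approximates each source, because the mask rescales magnitude and reuses the mixture phase.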

ICASSP2018-poster.pdf (746)

Categories: Source Separation and Signal Enhancement
Speech Enhancement (SPE-ENHA)
Source separation (MLR-SSEP)
Neural network learning (MLR-NNLR)

86 Views

ITERATIVE DEEP NEURAL NETWORKS FOR SPEAKER-INDEPENDENT BINAURAL BLIND SPEECH SEPARATION

ICASSPDemo.pptx

Audio examples (424)

Categories: Source Separation and Signal Enhancement

5 Views

BSS EVAL OR PEASS? PREDICTING THE PERCEPTION OF SINGING-VOICE SEPARATION

There is some uncertainty as to whether objective metrics for predicting the perceived quality of audio source separation are sufficiently accurate. This issue was investigated by employing a revised experimental methodology to collect subjective ratings of sound quality and interference of singing-voice recordings that have been extracted from musical mixtures using state-of-the-art audio source separation. A correlation analysis between the experimental data and the measures of two objective evaluation toolkits, BSS Eval and PEASS, was performed to assess their performance.
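A correlation analysis of this kind can be sketched with two standard coefficients: Pearson (linear agreement) and Spearman (rank agreement). The numbers below are made up for illustration and are not data from the study; the variable names `subjective` and `objective` are likewise assumptions.

```python
import math

def pearson(x, y):
    """Pearson linear correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical values: mean listener ratings and an objective score per system.
subjective = [20.0, 35.0, 50.0, 65.0, 80.0]
objective = [1.0, 4.0, 3.5, 8.0, 12.0]
print(pearson(subjective, objective), spearman(subjective, objective))
```

Spearman is often reported alongside Pearson in such comparisons because an objective metric may order systems correctly without being linearly related to the listener ratings.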

icassp18_poster_ward_et_al.pdf (492)

Categories: Source Separation and Signal Enhancement

14 Views

Language and Noise Transfer in Speech Enhancement Generative Adversarial Network

language-noise-transfer.pdf (428)

Categories: Source Separation and Signal Enhancement
Machine Learning for Signal Processing

10 Views

Correlated Tensor Factorization for Audio Source Separation

This paper presents an ultimate extension of nonnegative matrix factorization (NMF) for audio source separation based on full covariance modeling over all the time-frequency (TF) bins of the complex spectrogram of an observed mixture signal. Although NMF has been widely used for decomposing an observed power spectrogram in a TF-wise manner, it has the critical limitation that it cannot handle the phase values of interdependent TF bins.
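For orientation, the standard NMF baseline that this abstract builds on can be sketched with multiplicative updates for the squared Euclidean cost. This is plain NMF on a non-negative matrix, not the correlated tensor factorization itself; the helper names and the toy matrix `V_TOY` are assumptions for illustration.

```python
import random

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def transpose(a):
    return [list(row) for row in zip(*a)]

def frob_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))

def nmf(v, rank, iters=200, eps=1e-12, seed=0):
    """Factor V (freq x frames) into W (freq x rank) and H (rank x frames)."""
    rng = random.Random(seed)
    n_freq, n_frames = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n_freq)]
    h = [[rng.random() + 0.1 for _ in range(n_frames)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][t] * num[k][t] / (den[k][t] + eps) for t in range(n_frames)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[f][k] * num[f][k] / (den[f][k] + eps) for k in range(rank)]
             for f in range(n_freq)]
    return w, h

# Toy "power spectrogram" (4 bins x 3 frames) built from two spectral templates.
V_TOY = [[1, 0, 2], [0, 3, 3], [2, 0, 4], [0, 1, 1]]
W, H = nmf(V_TOY, rank=2)
print(frob_error(V_TOY, W, H))
```

Because the factors here are real and non-negative, each TF bin is modeled independently and without phase, which is the limitation the abstract refers to.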

icassp-2018-yoshii-poster.pdf (554)

Categories: Source Separation and Signal Enhancement

45 Views

Deep Learning Based Speech Beamforming

Multi-channel speech enhancement with ad-hoc sensors has been a challenging task. Speech-model-guided beamforming algorithms can recover natural-sounding speech, but the speech models tend to be oversimplified, as inference would otherwise be too complicated. On the other hand, deep-learning-based enhancement approaches can learn complicated speech distributions and perform efficient inference, but they cannot handle a variable number of input channels.

deep learning based speech beamforming.pdf

DeepBeam (378)

Categories: Source Separation and Signal Enhancement

46 Views

Crowdsourced Pairwise-Comparison for Source Separation Evaluation

Automated objective methods of audio source separation evaluation are fast, cheap, and require little effort by the investigator. However, their output often correlates poorly with human quality assessments, and they typically require ground-truth (perfectly separated) signals to evaluate algorithm performance. Subjective multi-stimulus human ratings (e.g., MUSHRA) of audio quality are the gold standard for many tasks, but they are slow and require a great deal of effort to recruit participants and run listening tests.

cartwright_caqe_icassp_2018_poster.pdf (533)

Categories: Source Separation and Signal Enhancement

28 Views

PRIMARY-AMBIENT SOURCE SEPARATION FOR UPMIXING TO SURROUNDING SOUND SYSTEMS

Extracting spatial information from an audio recording is a necessary step for upmixing stereo tracks to be played on surround systems. One important spatial feature is the perceived direction of the different audio sources in the recording, which determines how to remix the different sources in the surround system. The focus of this paper is the separation of two types of audio sources: primary (direct) and ambient (surrounding) sources. Several approaches have been proposed to solve the problem, based mainly on the correlation between the two channels in the stereo recording.
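One well-known correlation-based approach of the kind mentioned above is PCA-based extraction: the primary component is taken as the projection of the stereo signal onto the principal direction of the 2x2 inter-channel correlation matrix, and the ambient component is the residual. The sketch below illustrates that general idea only and is not the method proposed in this paper; it works on a whole signal at zero lag and assumes zero-mean channels, and the function name is an illustrative choice.

```python
import math

def primary_ambient_pca(left, right):
    """Split a stereo signal into primary and ambient parts via 2x2 PCA."""
    # Zero-lag auto- and cross-correlations (channels assumed zero-mean).
    r_ll = sum(a * a for a in left)
    r_rr = sum(b * b for b in right)
    r_lr = sum(a * b for a, b in zip(left, right))

    # Inter-channel correlation coefficient: near 1 for a dominant panned source.
    denom = math.sqrt(r_ll * r_rr)
    icc = r_lr / denom if denom > 0 else 0.0

    # Largest eigenvalue and its eigenvector of [[r_ll, r_lr], [r_lr, r_rr]].
    lam = 0.5 * (r_ll + r_rr + math.sqrt((r_ll - r_rr) ** 2 + 4 * r_lr ** 2))
    if r_lr != 0:
        v = (r_lr, lam - r_ll)
    else:
        v = (1.0, 0.0) if r_ll >= r_rr else (0.0, 1.0)
    norm = math.hypot(v[0], v[1])
    v = (v[0] / norm, v[1] / norm)

    # Primary: projection onto the principal direction. Ambient: what is left.
    proj = [v[0] * a + v[1] * b for a, b in zip(left, right)]
    primary_l = [v[0] * p for p in proj]
    primary_r = [v[1] * p for p in proj]
    ambient_l = [a - p for a, p in zip(left, primary_l)]
    ambient_r = [b - p for b, p in zip(right, primary_r)]
    return icc, primary_l, primary_r, ambient_l, ambient_r

# Toy input: one source amplitude-panned with gains 0.8 (left) and 0.6 (right).
source = [math.sin(0.1 * n) for n in range(200)]
left = [0.8 * s for s in source]
right = [0.6 * s for s in source]
icc, p_l, p_r, a_l, a_r = primary_ambient_pca(left, right)
print(icc, max(abs(x) for x in a_l + a_r))
```

With a single panned source and no ambience, the two channels are fully correlated, so the whole signal is assigned to the primary component and the ambient residual is numerically zero.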