Sorry, you need to enable JavaScript to visit this website.

ICASSP is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The 2019 conference will feature world-class presentations by internationally renowned speakers, cutting-edge session topics and provide a fantastic opportunity to network with like-minded professionals from around the world. Visit website.

It is commonly believed that multipath hurts various audio processing algorithms. At odds with this belief, we show that multipath in fact helps sound source separation, even with very simple propagation models. Unlike most existing methods, we neither ignore the room impulse responses, nor we attempt to estimate them fully. We rather assume to know the positions of a few virtual microphones generated by echoes and we show how this gives us enough spatial diversity to get a performance boost over the anechoic case.

Categories:
46 Views

Informed source separation (ISS) uses source separation for extracting audio objects out of their downmix given some pre-computed parameters. In recent years, non-negative tensor factorization (NTF) has proven to be a good choice for compressing audio objects at an encoding stage. At the decoding stage, these parameters are used to separate the downmix with Wiener-filtering. The quantized NTF parameters have to be encoded to a bitstream prior to transmission.

Categories:
11 Views

This paper presents a new method --- adversarial advantage actor-critic (Adversarial A2C), which significantly improves the efficiency of dialogue policy learning in task-completion dialogue systems. Inspired by generative adversarial networks (GAN), we train a discriminator to differentiate responses/actions generated by dialogue agents from responses/actions by experts.

Categories:
43 Views

In the partial relaxation approach, at each desired direction, the manifold structure of the remaining interfering signals impinging on the sensor array is relaxed, which results in closed form estimates for the interference parameters. By adopting this approach, in this paper, a new estimator based on the unconstrained covariance fitting problem is proposed. To obtain the null-spectra efficiently, an iterative rooting scheme based on the rational function approximation is applied.

Categories:
9 Views

This paper introduces architecture with high throughput, low on-chip memory, and efficient data access for Improved Dense Trajectories (iDT) as video representations for real-time action recognition. The iDT feature can capture long-term motion cues better than any existing deep feature, which makes it crucial in state-of-the-art action recognition systems.

Categories:
12 Views

In this paper, we build up a hybrid neural network (NN) for singing melody extraction from polyphonic music by imitating human pitch perception. For human hearing, there are two pitch perception models, the spectral model and the temporal model, in accordance with whether harmonics are resolved or not. Here, we first use NNs to implement individual models and evaluate their performance in the task of singing melody extraction.

Categories:
28 Views

Before the era of the neural network (NN), features extracted from auditory models have been applied to various speech applications and been demonstrated more robust against noise than conventional speech-processing features. What's the role of auditory models in the current NN era? Are they obsolete? To answer this question, we construct a NN with a generative auditory model embedded to process speech signals.

Categories:
4 Views

Before the era of the neural network (NN), features extracted from auditory models have been applied to various speech applications and been demonstrated more robust against noise than conventional speech-processing features. What’s the role
of auditory models in the current NN era? Are they obsolete?
To answer this question, we construct a NN with a generative auditory model embedded to process speech signals. The
generative auditory model consists of two stages, the stage of spectrum estimation in the logarithmic-frequency axis by

Categories:
33 Views

Estimation of conduction velocity (CV) is an important task in the analysis of surface electromyography (sEMG). The problem can be framed as estimation of a time-varying delay (TVD) between electrode recordings. In this paper we present an algorithm which incorporates information from multiple electrodes into a single TVD estimation. The algorithm uses a common all-pass filter to relate two groups of signals at a local level.

Categories:
11 Views

Pages