ICASSP is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The 2019 conference will feature world-class presentations by internationally renowned speakers and cutting-edge session topics, and will provide a fantastic opportunity to network with like-minded professionals from around the world.

Analog audio effects and synthesizers often owe their distinct sound to circuit nonlinearities. Faithfully modeling such a significant aspect of the original sound in virtual analog software can prove challenging. The current work proposes a generic data-driven approach to virtual analog modeling and applies it to the Fender Bassman 56F-A vacuum-tube amplifier. Specifically, a feedforward variant of the WaveNet deep neural network is trained to carry out a regression on audio waveform samples from the input to the output of a SPICE model of the tube amplifier.
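WaveNet-style networks are built from dilated causal convolutions, so each output sample depends only on current and past input samples. The following is a minimal sketch of that one building block in plain Python, not the network described in the abstract: the function names, the filter weights, and the dilation pattern are all assumptions chosen for illustration, and no nonlinearity or training is included.

```python
def causal_dilated_conv(x, weights, dilation):
    """Compute y[n] = sum_k weights[k] * x[n - k * dilation].

    Samples before the start of the signal are treated as zero, so the
    output at time n never depends on future input samples.
    """
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = n - k * dilation
            if idx >= 0:
                acc += w * x[idx]
        y.append(acc)
    return y


def receptive_field(kernel_size, dilations):
    """Number of input samples seen by a stack of dilated causal layers."""
    return 1 + (kernel_size - 1) * sum(dilations)


# Hypothetical toy input and filter, for illustration only.
signal = [1.0, 2.0, 3.0, 4.0]
out = causal_dilated_conv(signal, [1.0, 1.0], dilation=2)

# Hypothetical dilation pattern; the abstract does not state the one used.
field = receptive_field(kernel_size=2, dilations=[1, 2, 4, 8])
```

Doubling the dilation per layer lets the receptive field grow quickly with depth, which is the usual motivation for this design.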

In this paper, we adapt Recurrent Neural Networks with Stochastic Layers, which are the state of the art for generating text, music and speech, to the problem of acoustic novelty detection. By integrating uncertainty into the hidden states, this type of network is able to learn the distribution of complex sequences. Because the learned distribution can be calculated explicitly in terms of probability, we can evaluate how likely an observation is and then flag low-probability events as novel.
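The decision rule in the last sentence, flagging observations whose likelihood under the learned model is low, can be sketched independently of the model itself. The example below swaps in a single Gaussian as a stand-in for the stochastic recurrent network, purely to keep the sketch self-contained; the function names, the toy data, and the threshold rule are assumptions and are not taken from the paper.

```python
import math


def fit_gaussian(values):
    """Fit a 1-D Gaussian (stand-in for the learned sequence model)."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, max(var, 1e-12)  # guard against zero variance


def log_likelihood(v, mean, var):
    """Log density of v under the fitted Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (v - mean) ** 2 / var)


def detect_novel(observations, mean, var, threshold):
    """Flag observations whose log-likelihood falls below the threshold."""
    return [log_likelihood(v, mean, var) < threshold for v in observations]


# Hypothetical "normal" training features, for illustration only.
normal_frames = [0.9, 1.0, 1.1, 1.0, 0.95, 1.05]
mean, var = fit_gaussian(normal_frames)

# Assumed threshold rule: lowest training log-likelihood minus a margin.
threshold = min(log_likelihood(v, mean, var) for v in normal_frames) - 1.0

flags = detect_novel([1.0, 1.02, 5.0], mean, var, threshold)
```

In practice the threshold would be tuned on held-out data; the margin used here is arbitrary.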

This work aims to develop an end-to-end solution for seizure onset detection. We design SeizNet, a Convolutional Neural Network for seizure detection. To compare SeizNet with a traditional machine learning approach, a baseline classifier is implemented using spectral band power features with Support Vector Machines (BPsvm). We explore the possibility of using the fewest channels needed for accurate seizure detection by evaluating the SeizNet and BPsvm approaches in all-channel and two-channel settings, respectively.
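The baseline's band power features can be illustrated with a short sketch. It computes the mean periodogram power per frequency band and per channel using a naive DFT from the standard library; the band edges, sampling rate, and function names are assumptions for illustration, and the SVM stage is omitted. The abstract does not specify how the authors computed their features.

```python
import cmath
import math


def band_power(signal, fs, f_lo, f_hi):
    """Mean periodogram power over DFT bins with f_lo <= frequency < f_hi."""
    n = len(signal)
    powers = []
    for k in range(n // 2 + 1):
        freq = k * fs / n
        if f_lo <= freq < f_hi:
            coeff = sum(
                x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(signal)
            )
            powers.append(abs(coeff) ** 2 / n)
    return sum(powers) / len(powers) if powers else 0.0


def band_power_features(channels, fs, bands):
    """Concatenate band powers over all channels into one feature vector."""
    return [band_power(ch, fs, lo, hi) for ch in channels for lo, hi in bands]


# Hypothetical one-second, single-channel test tone at 10 Hz.
fs = 128
tone = [math.sin(2 * math.pi * 10 * t / fs) for t in range(fs)]

# Hypothetical band edges in Hz, chosen only for this example.
bands = [(4, 8), (8, 13), (13, 30)]
features = band_power_features([tone], fs, bands)
```

A two-channel setting would simply pass two channel signals, giving a feature vector twice as long for the classifier.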

We propose a Denoising Autoencoder (DAE) for speaker recognition, trained to map each individual ivector to the mean of all ivectors belonging to that particular speaker. The aim of this DAE is to compensate for inter-session variability and increase the discriminative power of the ivectors prior to PLDA scoring. We test the proposed approach on the MCE 2018 1st Multi-target speaker detection and identification Challenge Evaluation. This evaluation presents a call-center fraud detection scenario: given a speech segment, detect whether it belongs to any of the speakers in a blacklist.
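The training targets described above, where each ivector is paired with the mean of its speaker's ivectors, can be assembled as in the following sketch. Only the construction of the (input, target) pairs is shown; the autoencoder itself and PLDA scoring are omitted, and the function names and toy vectors are assumptions for illustration.

```python
def speaker_means(ivectors, speaker_ids):
    """Average the ivectors of each speaker, dimension by dimension."""
    sums, counts = {}, {}
    for vec, spk in zip(ivectors, speaker_ids):
        if spk not in sums:
            sums[spk] = [0.0] * len(vec)
            counts[spk] = 0
        sums[spk] = [s + v for s, v in zip(sums[spk], vec)]
        counts[spk] += 1
    return {spk: [s / counts[spk] for s in sums[spk]] for spk in sums}


def dae_training_pairs(ivectors, speaker_ids):
    """Pair each ivector (input) with its speaker's mean ivector (target)."""
    means = speaker_means(ivectors, speaker_ids)
    return [(vec, means[spk]) for vec, spk in zip(ivectors, speaker_ids)]


# Hypothetical 2-dimensional ivectors; real ivectors are much larger.
ivectors = [[1.0, 0.0], [3.0, 2.0], [0.0, 4.0]]
speaker_ids = ["spk_a", "spk_a", "spk_b"]
pairs = dae_training_pairs(ivectors, speaker_ids)
```

Because every session of a speaker shares the same target, a network trained on these pairs is pushed to discard session-specific variation.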

Visual relationship recognition, a challenging task that aims to distinguish the interactions between object pairs, has received much attention recently. Because most visual relationships are semantic concepts defined by human beings, they carry a great deal of human knowledge, or priors, which has not been fully exploited by existing methods.

An underdetermined multi-measurement vector linear regression problem is considered where the parameter matrix is row-sparse and where an additional constraint fixes the number of nonzero elements in the active rows. Although this additional constraint offers side structure information that could be exploited to improve the estimation accuracy, it is highly nonconvex and must be handled with caution. A detection algorithm is proposed that capitalizes on compressed sensing results and on the generalized distributive law (message passing on factor graphs).

The Just Noticeable Difference (JND) reveals the minimum distortion that the Human Visual System (HVS) can perceive. Traditional studies on JND mainly focus on background luminance adaptation and contrast masking. However, the HVS does not perceive visual content based on individual pixels or blocks, but on the entire image. In this work, we conduct an interactive subjective visual quality study on the Picture-level JND (PJND) of compressed stereo images. The study, which involves 48 subjects and 10 stereoscopic images compressed with H.265 intra coding and JPEG2000, includes two parts.
