ICASSP 2018

ICASSP is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The 2019 conference will feature world-class presentations by internationally renowned speakers and cutting-edge session topics, and will provide a fantastic opportunity to network with like-minded professionals from around the world.

WEIGHTED AND MULTI-TASK LOSS FOR RARE AUDIO EVENT DETECTION

ICASSP2018_Poster.pdf (744)

Categories:: Audio and Acoustic Signal Processing

4 Views

ENABLING EARLY AUDIO EVENT DETECTION WITH NEURAL NETWORKS

icassp2018.pdf (377)

Categories:: Audio and Acoustic Signal Processing

5 Views

BirdVox-full-night: a dataset and website for avian flight call detection.

This article addresses the automatic detection of vocal, nocturnally migrating birds from a network of acoustic sensors.
Thus far, owing to the lack of annotated continuous recordings, existing methods had been benchmarked in a binary classification setting (presence vs. absence).
Instead, with the aim of comparing them in event detection, we release BirdVox-full-night, a dataset of 62 hours of audio comprising 35402 flight calls of nocturnally migrating birds, as recorded from 6 sensors.

lostanlen_icassp2018_poster.pdf (757)

Categories:: Bioacoustics and Medical Acoustics

294 Views

Recognition of Faces and Facial Attributes using Accumulative Local Sparse Representations

This paper addresses the problem of automated recognition of faces and facial attributes by proposing a new general approach called Accumulative Local Sparse Representation (ALSR). In the learning stage, we build a general dictionary of patches that are extracted from face images in a dense manner on a grid. In the testing stage, patches of the query image are sparsely represented using a local dictionary. This dictionary contains similar atoms of the general dictionary that are spatially in the same neighborhood.

2018-ICASSP-ALSR.pptx

PowerPoint Presentation for ALSR by Mery and Banerjee (442)

Categories:: Biometrics

17 Views

A Greedy Pursuit Algorithm For Separating Signals From Nonlinear Compressive Observations

In this paper, we study the unmixing problem, which aims to separate a set of structured signals from their superposition. We consider the scenario in which the mixture is observed via nonlinear compressive measurements. We present a fast, robust, greedy algorithm called Unmixing Matching Pursuit (UnmixMP) to solve this problem. We prove rigorously that the algorithm can recover the constituents from their noisy nonlinear compressive measurements with arbitrarily small error. We compare our algorithm

A Greedy Pursuit Algorithm For Separating Signals From.pdf (351)

Categories:: Learning theory and algorithms (MLR-LEAR)

6 Views

Generative Adversarial Source Separation

pres.pdf (475)

Categories:: Audio and Acoustic Signal Processing

16 Views

REALIZING DIRECTIONAL SOUND SOURCE IN FDTD METHOD BY ESTIMATING INITIAL VALUE

Wave-based acoustic simulation methods are actively studied for predicting acoustical phenomena. The finite-difference time-domain (FDTD) method is one of the most popular owing to the straightforward way it computes an impulse response. In an FDTD simulation, an omnidirectional sound source is usually adopted, which is not realistic because real sound sources often have specific directivities. However, there is very little research on imposing a directional sound source in FDTD methods.
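
For readers unfamiliar with FDTD, the following is a minimal sketch of the conventional omnidirectional case described in the abstract, not the authors' directional method. It assumes a 2-D scalar wave equation, a radially symmetric Gaussian as the initial pressure distribution, and zero-pressure edges; the function name, grid size, and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np

def simulate_fdtd_2d(n=101, steps=60, c=343.0, dx=0.05, courant=0.5, sigma=2.0):
    """Leapfrog FDTD for the 2-D scalar wave equation (illustrative sketch).

    All defaults are assumed values, not taken from the paper.
    """
    dt = courant * dx / c              # courant <= 1/sqrt(2) keeps the 2-D scheme stable
    lam2 = (c * dt / dx) ** 2          # squared Courant number

    # Omnidirectional source: radially symmetric Gaussian initial pressure.
    idx = np.arange(n) - n // 2
    xx, yy = np.meshgrid(idx, idx, indexing="ij")
    p_now = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    p_prev = p_now.copy()              # approximates zero initial velocity

    for _ in range(steps):
        # Five-point discrete Laplacian on the interior of the grid.
        lap = np.zeros_like(p_now)
        lap[1:-1, 1:-1] = (p_now[2:, 1:-1] + p_now[:-2, 1:-1]
                           + p_now[1:-1, 2:] + p_now[1:-1, :-2]
                           - 4.0 * p_now[1:-1, 1:-1])
        p_next = 2.0 * p_now - p_prev + lam2 * lap
        # Zero-pressure edges (a simplification; no absorbing layer).
        p_next[0, :] = p_next[-1, :] = 0.0
        p_next[:, 0] = p_next[:, -1] = 0.0
        p_prev, p_now = p_now, p_next
    return p_now

pressure = simulate_fdtd_2d()
# The field stays symmetric under axis swaps and mirroring,
# i.e. the source radiates equally in the grid directions.
print(np.allclose(pressure, pressure.T), np.allclose(pressure, pressure[::-1, :]))
```

Because the initial pressure is radially symmetric, the simulated wavefront spreads equally along the grid axes. A directional source would instead require an initial distribution that breaks this symmetry, which is the kind of problem the abstract refers to.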

ICASSP2018Poster.pdf (563)

Categories:: Audio and Acoustic Signal Processing

29 Views

Maximal Figure-of-Merit Embedding for Multi-label Audio Classification

Presentation.pdf (777)

Categories:: Audio and Acoustic Signal Processing

56 Views

Vocal melody extraction using patch-based CNN

The patch-based convolutional neural network (CNN) model presented in this paper for vocal melody extraction in polyphonic music is inspired by object detection in image processing. The input of the model is a novel time-frequency representation which enhances the pitch contours and suppresses the harmonic components of a signal. This succinct data representation and the patch-based CNN model enable an efficient training process with limited labeled data. Experiments on various datasets show excellent speed and competitive accuracy compared to other deep learning approaches.