Sorry, you need to enable JavaScript to visit this website.

This presentation introduces a Deep Learning model that performs classification of the Audio Scene in the subway environment. The final goal is to detect Screams and Shouts for surveillance purposes. The model is a combination of Deep Belief Network and Deep Neural Network, (generatively pre-trained within the DBN framework and fine-tuned discriminatively within the DNN framework), and is trained on a novel database of pseudo-real signals collected in the Paris metro.

Categories:
5 Views

Abstract—In most applications of sinusoidal models for speech
signal, an amplitude spectral envelope is necessary. This envelope
is not only assumed to fit the vocal tract filter response as
accurately as possible, but it should also exhibit slow varying
shapes across time. Indeed, time irregularities can generate
artifacts in signal manipulations or increase improperly the
features variance used in statistical models. In this letter, a
simple technique is suggested to improve this time regularity.

Categories:
5 Views

We propose signal reconstruction algorithms which utilize a guiding subspace that represents desired properties of reconstructed signals. Optimal reconstructed signals are shown to belong to a convex bounded set, called the ``reconstruction'' set. Iterative reconstruction algorithms, based on conjugate gradient methods, are developed to approximate optimal reconstructions with low memory and computational costs. Effectiveness of the proposed method is demonstrated with an application to image magnification.

Categories:
8 Views

With the strong growth of assistive and personal listening devices, natural sound rendering over headphones is becoming a necessity for prolonged listening in multimedia and virtual reality applications. The aim of natural sound rendering is to naturally recreate the sound scenes with the spatial and timbral quality as natural as possible, so as to achieve a truly immersive listening experience. However, rendering natural sound over headphones encounters many challenges. This tutorial article presents signal processing techniques to tackle these challenges to assist human listening.

Categories:
16 Views

Audio signals for moving pictures and video games are often linear combinations of primary and ambient components. In spatial audio analysis-synthesis, these mixed signals are usually decomposed into primary and ambient components to facilitate flexible spatial rendering and enhancement. Existing approaches such as principal component analysis (PCA) and least squares (LS) are widely used to perform this decomposition from stereo signals.

Categories:
4 Views

Audio signals for moving pictures and video games are often linear combinations of primary and ambient components. In spatial audio analysis-synthesis, these mixed signals are usually decomposed into primary and ambient components to facilitate flexible spatial rendering and enhancement. Existing approaches such as principal component analysis (PCA) and least squares (LS) are widely used to perform this decomposition from stereo signals.

Categories:
5 Views

Pages