ICASSP is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The 2019 conference will feature world-class presentations by internationally renowned speakers and cutting-edge session topics, and will provide a fantastic opportunity to network with like-minded professionals from around the world.

In this paper, we examine dropout approaches in a Long Short-Term Memory (LSTM) based automatic speech recognition (ASR) system trained with the Connectionist Temporal Classification (CTC) loss function. In particular, using an Eesen-based LSTM-CTC speech recognition system, we present dropout implementations that yield significant improvements in speech recognizer performance on the Librispeech and GALE Arabic datasets, with 24.64% and 13.75% relative reductions in word error rate (WER) from their respective baselines.
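The "relative reduction" quoted above is a ratio against the baseline WER, not a difference in percentage points. A minimal sketch of that arithmetic follows; the function name and the example WER values are hypothetical and are not taken from the paper.

```python
def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Return the relative WER reduction in percent.

    Computed as 100 * (baseline - new) / baseline, so it expresses the
    improvement as a fraction of the baseline error rate.
    """
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


# Hypothetical WER values chosen only to illustrate the formula.
example = relative_wer_reduction(baseline_wer=20.0, new_wer=15.0)
print(f"{example:.2f}% relative reduction")  # prints "25.00% relative reduction"
```

In this illustration, a drop from 20% to 15% WER is 5 percentage points absolute but 25% relative.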

This paper presents the formulation and analysis of a novel distributed maximum likelihood algorithm based on a first-order optimization scheme. The proposed approach uses a static average consensus algorithm to reach agreement on the initial condition of the iterative optimization scheme and a dynamic average consensus algorithm to reach agreement on the gradient direction. The proposed distributed algorithm is guaranteed to recover the performance of the centralized algorithm exponentially fast.
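For readers unfamiliar with the term, static average consensus can be illustrated with the textbook neighbor-averaging iteration below. This is a generic sketch and not necessarily the variant used in the paper; the ring topology, step size, iteration count, and initial values are all assumptions made for illustration.

```python
def static_average_consensus(values, neighbors, step=0.3, iterations=100):
    """Run the standard linear consensus iteration on an undirected graph.

    Each node i repeatedly updates
        x_i <- x_i + step * sum_j (x_j - x_i)   over its neighbors j,
    using only local information. On a connected undirected graph with a
    step size below 1 / (maximum degree), every x_i approaches the average
    of the initial values.
    """
    x = list(values)
    for _ in range(iterations):
        # The comprehension reads the old x for every node (synchronous update).
        x = [
            xi + step * sum(x[j] - xi for j in neighbors[i])
            for i, xi in enumerate(x)
        ]
    return x


# Hypothetical 5-node ring: node i talks to nodes i-1 and i+1.
ring = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}
initial = [1.0, 4.0, 2.0, 8.0, 5.0]  # arbitrary local values, average is 4.0
final = static_average_consensus(initial, ring)
print([round(v, 6) for v in final])
```

Dynamic average consensus, which the abstract uses for the gradient direction, tracks the average of time-varying local signals and is not shown here.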

We address the problem of constructing false data injection (FDI) attacks that can bypass the bad data detector (BDD) of a power grid. The attacker is assumed to have access to only power flow measurement data traces (collected over a limited period of time) and no other prior knowledge about the grid. Existing related algorithms are formulated under the assumption that the attacker has access to measurements collected over a long (asymptotically infinite) time period, which may not be realistic.

Cross-modal sketch-photo recognition is of vital importance in law enforcement and public security. Most existing methods are dedicated to bridging the gap between the low-level visual features of sketches and photo images, which is limited due to intrinsic differences in pixel values. In this paper, based on the intuition that sketches and photo images are highly correlated in the semantic domain, we propose to jointly utilize the low-level visual features and high-level facial attributes to

We propose an end-to-end model based on convolutional and recurrent neural networks for speech enhancement. Our model is purely data-driven and does not make any assumptions about the type or the stationarity of the noise. In contrast to existing methods that use multilayer perceptrons (MLPs), we employ both convolutional and recurrent neural network architectures. Thus, our approach allows us to exploit local structures in both the frequency and temporal domains.

Our work is based on a recently introduced mathematical theory of deep convolutional neural networks (DCNNs). It was shown that DCNNs are stable with respect to deformations of bandlimited input functions. In the present paper, we generalize this result: We prove deformation stability on Sobolev spaces. Further, we show a weak form of deformation stability for the whole input space L². The basic components of DCNNs are semi-discrete frames. For practical applications, a concrete choice is necessary.
