
ICASSP is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The 2019 conference will feature world-class presentations by internationally renowned speakers and cutting-edge session topics, and will provide a fantastic opportunity to network with like-minded professionals from around the world.

Techniques for multi-lingual and cross-lingual speech recognition can help in low-resource scenarios, to bootstrap systems and enable analysis of new languages and domains. End-to-end approaches, in particular sequence-based techniques, are attractive because of their simplicity and elegance. While it is possible to integrate traditional multi-lingual bottleneck feature extractors as front-ends, we show that end-to-end multi-lingual training of sequence models is effective on context-independent models trained using Connectionist Temporal Classification (CTC) loss.
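
For readers unfamiliar with CTC, the loss scores a target sequence by summing over all per-frame label paths that collapse to it. The sketch below shows only that collapse rule (merge repeated labels, then drop blanks); it is not the authors' training code, and the blank symbol `_` and the example path are assumptions chosen for illustration.

```python
from itertools import groupby

BLANK = "_"  # hypothetical blank symbol; real systems reserve a label index for it


def ctc_collapse(frame_labels, blank=BLANK):
    """Map a per-frame label path to an output sequence.

    Step 1: merge consecutive repeats. Step 2: drop blank symbols.
    """
    merged = [label for label, _ in groupby(frame_labels)]
    return [label for label in merged if label != blank]


# A blank between the two "l" runs keeps them from being merged into one.
path = ["_", "h", "h", "_", "e", "l", "l", "_", "l", "o", "o"]
print("".join(ctc_collapse(path)))  # prints "hello"
```

Because the collapse works on whatever label inventory it is given, the same rule applies unchanged when the output units are context-independent symbols shared across languages.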


Sonobuoy fields, comprising a network of transmitters and receivers, are commonly deployed to find and track underwater targets. For a given environment and sonobuoy field layout, the performance of such a field depends on the scheduling, that is, deciding which source should transmit, and which from a library of available waveforms should be transmitted at any given time. In this paper, we propose a novel scheduling framework based on multi-objective optimization. Specifically, we pose the two tasks of the sonobuoy field—tracking and searching—as separate, competing, objective functions.
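
To illustrate what treating tracking and searching as competing objectives can mean in practice, the sketch below filters a set of candidate scheduling actions down to the Pareto-optimal ones. The candidate names and scores are invented for illustration, and the abstract does not say how the framework actually evaluates or selects actions.

```python
def pareto_front(candidates):
    """Return candidates not dominated in (track_score, search_score).

    Each candidate is (name, track_score, search_score); higher is better.
    """
    def dominates(a, b):
        # a dominates b if it is no worse on both objectives and better on one.
        no_worse = a[1] >= b[1] and a[2] >= b[2]
        better = a[1] > b[1] or a[2] > b[2]
        return no_worse and better

    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Hypothetical (source, waveform) choices with made-up objective scores.
candidates = [
    ("src1/waveformA", 0.9, 0.2),
    ("src1/waveformB", 0.6, 0.6),
    ("src2/waveformA", 0.5, 0.5),
    ("src2/waveformB", 0.1, 0.95),
]
front = pareto_front(candidates)
print([name for name, _, _ in front])
```

Here `src2/waveformA` is dropped because `src1/waveformB` is better on both objectives; the remaining three represent different trade-offs between tracking and searching.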


The millimeter wave WLAN standard can be used for joint communication-radar by exploiting the waveform preamble as a radar pulse. The velocity estimation accuracy of this approach, however, is limited by the short integration time. Physically lengthening the radar pulse integration duration, in turn, reduces the communication data rate.
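
The dependence on integration time can be made concrete with the textbook Doppler relation: a coherent integration time T gives a Doppler resolution of about 1/T, hence a velocity resolution of about λ/(2T). The sketch below evaluates that relation; the 60 GHz carrier and the integration times are assumptions for illustration, and this is a resolution rule of thumb rather than the accuracy analysis of the paper.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s


def velocity_resolution(carrier_hz, integration_s):
    """Approximate radar velocity resolution: wavelength / (2 * integration time)."""
    wavelength = SPEED_OF_LIGHT / carrier_hz
    return wavelength / (2.0 * integration_s)


# Assumed 60 GHz carrier; integration times are hypothetical values.
for t in (1e-6, 1e-4, 1e-3):
    print(f"T = {t:.0e} s -> resolution ~ {velocity_resolution(60e9, t):.3g} m/s")
```

Each tenfold increase in integration time improves the resolution tenfold, which is why a short preamble on its own gives coarse velocity estimates.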


This paper focuses on multi-sensor anomaly detection for moving cognitive agents using both external and private first-person visual observations. Both observation types are used to characterize agents’ motion in a given environment. The proposed method generates locally uniform motion models by dividing a Gaussian process that approximates agents’ displacements on the scene and provides a Shared Level (SL) self-awareness based on Environment Centered (EC) models.


Storage, browsing and analysis of human activity videos can be significantly facilitated by automated video summarization. Unsupervised key-frame extraction remains the most widely applicable technique for summarizing activity videos. However, the specific properties of such videos make the problem difficult to solve. Typical relevant algorithms fall under the video frame clustering or the dictionary-of-representatives families, with salient dictionary learning having been recently proposed.


Selective active noise control (SANC) is a method that selects a pre-trained control filter for different primary noises, instead of computing the control filter coefficients in real time.
This paper:
1. Proves the frequency-band-match method.
2. Proposes a SANC algorithm based on a partitioned frequency-domain filter.
3. Validates the algorithm through both simulation and real-time experiments.
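
The selection idea can be sketched as a lookup: given an estimate of the primary noise band, pick the pre-trained filter whose training band matches it best. The code below is a minimal illustration under that assumption, using band overlap as the matching score; the filter names, band edges, and scoring rule are hypothetical and are not taken from the paper.

```python
def band_overlap(a, b):
    """Width in Hz of the intersection of two (low, high) bands; 0 if disjoint."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def select_filter(noise_band, filter_library):
    """Return the name of the filter whose training band overlaps noise_band most."""
    return max(filter_library,
               key=lambda name: band_overlap(noise_band, filter_library[name]))


# Hypothetical library: filter name -> frequency band (Hz) it was trained on.
filter_library = {
    "low": (50.0, 300.0),
    "mid": (300.0, 800.0),
    "high": (800.0, 2000.0),
}
chosen = select_filter((250.0, 600.0), filter_library)
print(chosen)  # prints "mid"
```

A lookup of this kind replaces continuous coefficient adaptation with a single comparison per noise estimate, which is the computational saving that motivates selecting among pre-trained filters.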


Phonetic variability is one of the primary challenges in short duration speaker verification. This paper proposes a novel method that modifies the standard normal distribution prior in the total variability model to use a mixture of Gaussians as the prior distribution. The proposed speaker-phonetic vectors are then estimated from the posterior probability of latent variables, and each vector has a phonetic meaning.

