Sorry, you need to enable JavaScript to visit this website.

ICASSP is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The ICASSP 2020 conference will feature world-class presentations by internationally renowned speakers, cutting-edge session topics and provide a fantastic opportunity to network with like-minded professionals from around the world. Visit website.

Scene text detection has received attention for years and achieved an impressive performance across various benchmarks. In this work, we propose an efficient and accurate approach to detect multi-oriented text in scene images. The proposed feature fusion mechanism allows us to use a shallower network to reduce the computational complexity. A self-attention mechanism is adopted to suppress false positive detections.

Categories:
15 Views

High-dimensional data structures, known as tensors, are fundamental in many applications, including multispectral imaging and color video processing. Compression of such huge amount of multidimensional data collected over time is of paramount importance, necessitating the process of quantization of measurements into discrete values. Furthermore, noise and issues related to the acquisition and transmission of signals frequently lead to unobserved, lost or corrupted measurements.

Categories:
21 Views

Generating accurate ground truth representations of human subjective experiences and judgements is essential for advancing our understanding of human-centered constructs such as emotions. Often, this requires the collection and fusion of annotations from several people where each one is subject to valuation disagreements, distraction artifacts, and other error sources.

Categories:
51 Views

Finite Rate of Innovation (FRI) theory considers sampling and reconstruction of classes of non-bandlimited continuous signals that have a small number of free parameters, such as a stream of Diracs. The task of reconstructing FRI signals from discrete samples is often transformed into a spectral estimation problem and solved using Prony's method and matrix pencil method which involve estimating signal subspaces. They achieve an optimal performance given by the Cramér-Rao bound yet break down at a certain peak signal-to-noise ratio (PSNR).

Categories:
16 Views

This contribution presents a novel approach for coherence-based signal enhancement. An estimator for the coherent-to-diffuse ratio (CDR) is devised, which exploits the concept of generalized magnitude coherence and thus, unlike common state-of-the-art schemes, can simultaneously take advantage of more than two microphones. Moreover, the speech enhancement by CDR-based spectral weighting is not performed as a post-filtering step, but by enhancing the most appropriate microphone signal.

Categories:
29 Views

A Turing machine is a model describing the fundamental limits of any realizable computer, digital signal processor (DSP), or field programmable gate array (FPGA). This paper shows that there exist very simple linear time-invariant (LTI) systems which can not be simulated on a Turing machine. In particular, this paper considers the linear system described by the voltage-current relation of an ideal capacitor. For this system, it is shown that there exist continuously differentiable and computable input signals such that the output signal is a continuous function which is not computable.

Categories:
28 Views

Word embeddings are shown to be a great asset for several Natural Language and Speech Processing tasks. While they are already evaluated on various NLP tasks, their evaluation on spoken or natural language understanding (SLU) is less studied. The goal of this study is two-fold: firstly, it focuses on semantic evaluation of common word embeddings approaches for SLU task; secondly, it investigates the use of two different data sets to train the embeddings: small and task-dependent corpus or huge and out-of-domain corpus.

Categories:
23 Views

A sensor (microphone) placement method based on mutual information for spatial sound field recording is proposed. The sound field recording methods using distributed sensors enable the estimation of the sound field inside a target region of arbitrary shape; however, it is a difficult task to find the best placement of sensors. We focus on the mutual-information-based sensor placement method in which spatial phenomena are modeled as a Gaussian process (GP).

Categories:
78 Views

Pages