ICASSP 2020

ICASSP is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The ICASSP 2020 conference will feature world-class presentations by internationally renowned speakers and cutting-edge session topics, and will provide an opportunity to network with like-minded professionals from around the world.

Efficient Scene Text Detection with Textual Attention Tower

Scene text detection has received attention for years and has achieved impressive performance across various benchmarks. In this work, we propose an efficient and accurate approach to detecting multi-oriented text in scene images. The proposed feature fusion mechanism allows us to use a shallower network, which reduces computational complexity. A self-attention mechanism is adopted to suppress false positive detections.

icassp_presentation.pdf (368)

Categories: Pattern recognition and classification (MLR-PATT)

17 Views

QUANTIZED TENSOR ROBUST PRINCIPAL COMPONENT ANALYSIS

High-dimensional data structures, known as tensors, are fundamental in many applications, including multispectral imaging and color video processing. Compressing such large amounts of multidimensional data collected over time is of paramount importance and requires quantizing the measurements into discrete values. Furthermore, noise and issues related to the acquisition and transmission of signals frequently lead to unobserved, lost, or corrupted measurements.

QTRPCA_presentation.pdf (401)

Categories: Image/Video Storage, Retrieval

22 Views

Presentation: Trapezoidal Segment Sequencing: A Novel Approach for Fusion of Human-produced Continuous Annotations

Generating accurate ground truth representations of human subjective experiences and judgements is essential for advancing our understanding of human-centered constructs such as emotions. Often, this requires collecting and fusing annotations from several people, each of whom is subject to valuation disagreements, distraction artifacts, and other error sources.

_Presentation__2020_ICASSP_Trapezoidal_Segment_Fusion.pdf (561)

Categories: Multimedia Signal Processing

54 Views

DEEP MATRIX COMPLETION ON GRAPHS: APPLICATION IN DRUG TARGET INTERACTION PREDICTION

ICASSP 2020.pdf (491)

Categories: Bioinformatics

9 Views

Reconstruction of FRI Signals Using Deep Neural Network Approaches

Finite Rate of Innovation (FRI) theory considers sampling and reconstruction of classes of non-bandlimited continuous signals that have a small number of free parameters, such as a stream of Diracs. The task of reconstructing FRI signals from discrete samples is often transformed into a spectral estimation problem and solved using Prony's method or the matrix pencil method, both of which involve estimating signal subspaces. These methods achieve optimal performance, as given by the Cramér-Rao bound, yet break down at a certain peak signal-to-noise ratio (PSNR).
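
To make the classical baseline concrete, the following is a minimal sketch of Prony's method (the annihilating filter) for a periodic stream of K Diracs, in the noiseless case. It is not the deep network approach of the presentation. The function name prony_diracs, the period tau, and the convention X[m] = (1/tau) * sum_k a_k * exp(-2j*pi*m*t_k/tau) for the Fourier coefficients are assumptions made for illustration.

```python
import numpy as np

def prony_diracs(X, K, tau=1.0):
    """Recover locations t_k and amplitudes a_k of K Diracs (hypothetical helper).

    X holds Fourier series coefficients X[m], m = 0..N-1, with N >= 2K, under
    the assumed convention X[m] = (1/tau) * sum_k a_k * exp(-2j*pi*m*t_k/tau).
    """
    X = np.asarray(X, dtype=complex)
    N = X.size
    # Annihilation equations: sum_{l=0}^{K} h[l] * X[m-l] = 0 for m = K..N-1.
    T = np.array([X[m - np.arange(K + 1)] for m in range(K, N)])
    # The filter h spans the null space of T: take the right singular vector
    # that belongs to the smallest singular value.
    _, _, Vh = np.linalg.svd(T)
    h = Vh[-1].conj()
    # The roots of the filter polynomial are u_k = exp(-2j*pi*t_k/tau).
    u = np.roots(h)
    t = np.mod(-np.angle(u) * tau / (2 * np.pi), tau)
    # Amplitudes follow from a linear least-squares fit with known locations.
    m = np.arange(N)
    V = np.exp(-2j * np.pi * np.outer(m, t) / tau)
    a = np.linalg.lstsq(V, X * tau, rcond=None)[0].real
    order = np.argsort(t)
    return t[order], a[order]

if __name__ == "__main__":
    # Synthetic, noiseless example with two Diracs (values chosen arbitrarily).
    t_true = np.array([0.2, 0.55])
    a_true = np.array([1.0, 0.5])
    m = np.arange(5)
    X = (a_true * np.exp(-2j * np.pi * np.outer(m, t_true))).sum(axis=1)
    print(prony_diracs(X, K=2))
```

With noisy coefficients, the null-space and root-finding steps become unstable below some signal-to-noise ratio, which is the breakdown behaviour the abstract refers to.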

ICASSP2020_Slides_final.pdf (485)

Categories: Other applications of machine learning (MLR-APPL)

21 Views

Generalized Coherence-based Signal Enhancement

This contribution presents a novel approach for coherence-based signal enhancement. An estimator for the coherent-to-diffuse ratio (CDR) is devised, which exploits the concept of generalized magnitude coherence and thus, unlike common state-of-the-art schemes, can simultaneously take advantage of more than two microphones. Moreover, the speech enhancement by CDR-based spectral weighting is not performed as a post-filtering step, but by enhancing the most appropriate microphone signal.

ICASSP_2020_presentation_Loellmann_final.pdf

ICASSP 2020 Presentation of H. Loellmann (372)

Categories: Source Separation and Signal Enhancement

32 Views

Can every analog system be simulated on a digital computer?

A Turing machine is a model describing the fundamental limits of any realizable computer, digital signal processor (DSP), or field programmable gate array (FPGA). This paper shows that there exist very simple linear time-invariant (LTI) systems that cannot be simulated on a Turing machine. In particular, this paper considers the linear system described by the voltage-current relation of an ideal capacitor. For this system, it is shown that there exist continuously differentiable and computable input signals such that the output signal is a continuous function which is not computable.
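
For reference, the textbook voltage-current relation of an ideal capacitor is given below, with capacitance C, voltage v(t), and current i(t). Reading the voltage as the input and the current as the output is an assumption here, suggested by the abstract's pairing of a continuously differentiable input with a continuous output; under that reading the system acts as a scaled differentiator.

```latex
i(t) = C \, \frac{\mathrm{d}v(t)}{\mathrm{d}t}
```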

BoPo_ICASSP2597.pdf (617)

Categories: Sampling and Reconstruction; Design and Implementation of Signal Processing Systems

36 Views

Back-to-Back Butterfly Network, an Adaptive Permutation Network for New Communication Standards

ICASSP_poster_Barcelona_2020_v4.pdf (512)

Categories: Algorithm and architecture co-optimization

9 Views

WHAT IS BEST FOR SPOKEN LANGUAGE UNDERSTANDING: SMALL BUT TASK-DEPENDANT EMBEDDINGS OR HUGE BUT OUT-OF-DOMAIN EMBEDDINGS?

Word embeddings have been shown to be a great asset for several Natural Language and Speech Processing tasks. While they have already been evaluated on various NLP tasks, their evaluation on spoken or natural language understanding (SLU) is less studied. The goal of this study is two-fold: firstly, it focuses on the semantic evaluation of common word embedding approaches for SLU tasks; secondly, it investigates the use of two different data sets to train the embeddings: a small, task-dependent corpus or a huge, out-of-domain corpus.

Slides_paper_number_ 3443.pdf (421)

Categories: Spoken Language Understanding (SLP-UNDE)

24 Views

Mutual-Information-Based Sensor Placement for Spatial Sound Field Recording

A sensor (microphone) placement method based on mutual information for spatial sound field recording is proposed. Sound field recording methods that use distributed sensors enable the estimation of the sound field inside a target region of arbitrary shape; however, finding the best placement of sensors is a difficult task. We focus on the mutual-information-based sensor placement method in which spatial phenomena are modeled as a Gaussian process (GP).
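
As an illustration of the general idea, the sketch below shows a common greedy approximation for mutual-information-based placement under a GP prior: at each step it picks the candidate whose variance is least explained by the sensors already chosen and best explained by the remaining candidates. The squared-exponential kernel, the noise level, and the names rbf_kernel, cond_var, and greedy_mi_placement are assumptions for illustration; the kernel used for sound fields in the presentation may well differ.

```python
import numpy as np

def rbf_kernel(P, length_scale=0.3, noise=1e-2):
    """Squared-exponential covariance between candidate positions P (n x d).

    The kernel and its parameters are placeholders, not taken from the paper.
    """
    d2 = np.sum((P[:, None, :] - P[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * d2 / length_scale**2) + noise * np.eye(len(P))

def cond_var(K, y, S):
    """Variance of candidate y conditioned on the candidates indexed by S."""
    if len(S) == 0:
        return K[y, y]
    K_SS = K[np.ix_(S, S)]
    k_Sy = K[S, y]
    return K[y, y] - k_Sy @ np.linalg.solve(K_SS, k_Sy)

def greedy_mi_placement(K, n_sensors):
    """Greedily select n_sensors candidate indices from covariance matrix K."""
    n = K.shape[0]
    selected = []
    for _ in range(n_sensors):
        best, best_score = None, -np.inf
        for y in range(n):
            if y in selected:
                continue
            # Candidates that are neither selected nor the one being scored.
            rest = [i for i in range(n) if i != y and i not in selected]
            # Ratio of conditional variances: high when y is poorly predicted
            # by the chosen sensors but well predicted by the unchosen ones.
            score = cond_var(K, y, selected) / cond_var(K, y, rest)
            if score > best_score:
                best, best_score = y, score
        selected.append(best)
    return selected

if __name__ == "__main__":
    # Candidate positions on a 5 x 5 grid in the unit square (arbitrary choice).
    g = np.linspace(0.0, 1.0, 5)
    P = np.array([[x, y] for x in g for y in g])
    print(greedy_mi_placement(rbf_kernel(P), n_sensors=4))
```

This version re-solves a linear system for every candidate at every step, which is fine for a few dozen candidates but would need incremental updates for larger candidate sets.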

ICASSP2020_Nishida.pdf (401)

Categories: Spatial and Multichannel Audio

79 Views