ICASSP 2019

ICASSP is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The 2019 conference will feature world-class presentations by internationally renowned speakers and cutting-edge session topics, and will provide an opportunity to network with like-minded professionals from around the world.

Robust Room Equalization Using Sparse Sound-Field Reconstruction

MazurKatzberg_RobustRoomEqual.pdf (369)

Categories: Audio and Acoustic Signal Processing

14 Views

Fuzzy Personalized Scoring Model for Recommendation System

In this research, we propose a data preprocessing framework, aimed in particular at the financial sector, that generates rating data as input to the collaborative system. First, a clustering technique is applied to group all users based on their demographic information, which may differentiate the customers’ backgrounds. Then, for each customer group, the importance of the demographic characteristics that are highly associated with financial product purchasing is analyzed by the proposed fuzzy integral technique.
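As a rough illustration of how a fuzzy integral can weigh interacting characteristics, the sketch below computes a discrete Choquet integral, which is one common type of fuzzy integral. The abstract does not say which fuzzy integral the authors use, so the choice of the Choquet form, the function name `choquet_integral`, the criteria names, and all numeric values are assumptions made purely for illustration.

```python
def choquet_integral(scores, measure):
    """Discrete Choquet integral of `scores` with respect to a fuzzy measure.

    scores:  dict mapping criterion name -> value in [0, 1]
    measure: dict mapping frozenset of criteria -> weight in [0, 1],
             with measure[frozenset()] == 0 and measure[all criteria] == 1
    """
    # Visit criteria from the lowest score to the highest.
    ordered = sorted(scores, key=scores.get)
    total = 0.0
    previous = 0.0
    for i, name in enumerate(ordered):
        # Coalition of all criteria whose score is at least the current one.
        coalition = frozenset(ordered[i:])
        total += (scores[name] - previous) * measure[coalition]
        previous = scores[name]
    return total


# Hypothetical fuzzy measure over two demographic characteristics.
measure = {
    frozenset(): 0.0,
    frozenset({"age"}): 0.3,
    frozenset({"income"}): 0.6,
    frozenset({"age", "income"}): 1.0,
}

# Hypothetical normalized scores for one customer.
scores = {"age": 0.8, "income": 0.5}

result = choquet_integral(scores, measure)
print(round(result, 2))
```

Unlike a plain weighted average, the measure assigns a weight to every subset of characteristics, so the weight of the pair need not equal the sum of the individual weights.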

Poster_design_20190508_final.pdf (590)

Categories: Knowledge and Data Engineering
Emerging: Big Data

14 Views

Evaluating Salience Representations for Cross-Modal Retrieval of Western Classical Music Recordings

In this contribution, we consider a cross-modal retrieval scenario of Western classical music. Given a short monophonic musical theme in symbolic notation as query, the objective is to find relevant audio recordings in a database. A major challenge of this retrieval task is the possible difference in the degree of polyphony between the monophonic query and the music recordings. Previous studies for popular music addressed this issue by performing the cross-modal comparison based on predominant melodies extracted from the recordings.

2019_ZalkowBM_SalienceThemeRetrieval_ICASSP.pdf (402)

2019_ZalkowBM_SalienceThemeRetrieval_ICASSP.pdf (336)

Categories: Music Signal Processing

74 Views

View-Invariant Action Recognition From RGB Data via 3D Pose Estimation

In this paper, we propose a novel view-invariant action recognition method using a single monocular RGB camera. View-invariance remains a very challenging topic in 2D action recognition due to the lack of 3D information in RGB images. Most successful approaches make use of the concept of knowledge transfer by projecting 3D synthetic data to multiple viewpoints. Instead of relying on knowledge transfer, we propose to augment the RGB data by a third dimension by means of 3D skeleton estimation from 2D images using a CNN-based pose estimator.

ICASSP_Renato_final.pdf (343)

Categories: Image/Video Processing

15 Views

Poster of pixel level data augmentation for semantic image segmentation using generative adversarial networks

Semantic segmentation is one of the basic topics in computer vision; it aims to assign a semantic label to every pixel of an image. An unbalanced semantic label distribution can have a negative influence on segmentation accuracy. In this paper, we investigate using a data augmentation approach to balance the label distribution in order to improve segmentation performance. We propose using generative adversarial networks (GANs) to generate realistic images for improving the performance of semantic segmentation networks.
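To make the notion of an unbalanced label distribution concrete, the minimal sketch below measures the fraction of pixels assigned to each class in a label map. The function name `label_distribution`, the class meanings, and the toy 4x4 label map are hypothetical and not taken from the paper. The sketch only shows the kind of imbalance the augmentation is meant to address, not the GAN-based method itself.

```python
from collections import Counter


def label_distribution(label_map):
    """Return the fraction of pixels assigned to each class label."""
    counts = Counter(label for row in label_map for label in row)
    total = sum(counts.values())
    return {label: count / total for label, count in sorted(counts.items())}


# Hypothetical 4x4 label map: 0 = road, 1 = car, 2 = pedestrian.
toy_labels = [
    [0, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 2],
]

dist = label_distribution(toy_labels)
for label, fraction in dist.items():
    print(f"class {label}: {fraction:.4f}")
```

In this toy example class 0 covers 12 of the 16 pixels while class 2 covers only one, which is the kind of skew that can bias a segmentation network toward the dominant class.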

ICASSP2019.pdf

poster of the paper (403)

Categories: Image/Video Processing

131 Views

Complex Neural Beamforming

icassp2019talk.pdf (386)

Categories: Spatial and Multichannel Audio

12 Views

Robust gridless sound field decomposition based on structured reciprocity gap functional in spherical harmonic domain

A sound field reconstruction method for a region including sources is proposed. Under the assumption of spatial sparsity of the sources, this reconstruction problem has been solved by using sparse decomposition algorithms with discretization of the target region. Since this discretization leads to the off-grid problem, we previously proposed a gridless sound field decomposition method based on the reciprocity gap functional in the spherical harmonic domain.

main.pdf (400)

Categories: Loudspeaker and Microphone Array Signal Processing
Spatial and Multichannel Audio

27 Views

Global and Local Mode-domain Adaptive Algorithms for Spatial Active Noise Control Using Higher-order Sources

The aim of spatial active noise control (ANC) is to attenuate noise over a certain space. Although a large-scale system is required to achieve spatial ANC, mode-domain signal processing makes it possible to reduce the computational cost and improve the performance. A higher-order source (HOS) has an advantage in sound field control due to its controllable directivity patterns. An array of HOS can suppress an undesired exterior sound propagation while occupying a smaller physical space than a conventional omnidirectional

190408_ICASSP2019_poster_murata_sigport.pdf (599)

Categories: Active Noise Control
Spatial and Multichannel Audio

60 Views

Fast Implementation of Double-coupled Nonnegative Canonical Polyadic Decomposition

Real-world data exhibiting high order/dimensionality and various couplings are linked to each other since they share some common characteristics. Coupled tensor decomposition has become a popular technique for group analysis in recent years, especially for simultaneous analysis of multi-block tensor data with common information. To address the multi-block tensor data, we propose a fast double-coupled nonnegative Canonical Polyadic Decomposition (FDC-NCPD)

Poster_Xiulin Wang.pdf (485)

Categories: Machine Learning for Signal Processing

11 Views

Models of visually grounded speech signal pay attention to nouns: a bilingual experiment on English and Japanese

We investigate the behaviour of attention in neural models of visually grounded speech trained on two languages: English and Japanese. Experimental results show that attention focuses on nouns and that this behaviour holds true for two typologically very different languages. We also draw parallels between artificial neural attention and human attention and show that neural attention focuses on word endings, as has been theorised for human attention. Finally, we investigate how two visually grounded monolingual models can be used to perform cross-lingual speech-to-speech retrieval.

POSTER_ICASSP.pdf (627)

ARTICLE_ICASSP2019.pdf (447)

Categories: Human Spoken Language Acquisition, Development and Learning (SLP-LADL)

14 Views
