ICASSP 2019

ICASSP is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The 2019 conference will feature world-class presentations by internationally renowned speakers and cutting-edge session topics, and will provide a fantastic opportunity to network with like-minded professionals from around the world.

Multipath Enabled Private Audio with Noise

We address the problem of privately communicating audio messages to multiple listeners in a reverberant room using a set of loudspeakers. We propose two methods based on emitting noise. In the first method, the loudspeakers emit noise signals that are appropriately filtered so that after echoing along multiple paths in the room, they sum up and descramble to yield distinct meaningful audio messages only at specific focusing spots, while being incoherent everywhere else.

icassp_poster_v3.pdf (456)

Categories: Spatial and Multichannel Audio

12 Views

FROM TV-L1 TO GATED RECURRENT NETS

poster_3151.pdf (334)

Categories: Image/Video Processing

5 Views

FROM TV-L1 TO GATED RECURRENT NETS

poster_3151.pdf (307)

Categories: Image/Video Processing

4 Views

Towards Better Confidence Estimation for Neural Models

conference_poster_5.pdf (696)

Categories: Spoken Language Processing

23 Views

A Deep Generative Model of Speech Complex Spectrograms

This paper proposes an approach to the joint modeling of the short-time Fourier transform magnitude and phase spectrograms with a deep generative model. We assume that the magnitude follows a Gaussian distribution and the phase follows a von Mises distribution. To improve the consistency of the phase values in the time-frequency domain, we also apply the von Mises distribution to the phase derivatives, i.e., the group delay and the instantaneous frequency. Based on these assumptions, we explore and compare several combinations of loss functions for training our models.
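The phase-related loss terms described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a von Mises concentration fixed to 1, under which the negative log-likelihood reduces to the negative cosine of the phase error up to constants. It also approximates the instantaneous frequency and the group delay by finite differences along time and frequency. The function names and the equal weighting of the three terms are hypothetical choices made for illustration.

```python
import math

def von_mises_nll(pred, target):
    """Mean of -cos(pred - target).

    This is the von Mises negative log-likelihood with the concentration
    fixed to 1 and constant terms dropped (an assumption of this sketch).
    """
    return -sum(math.cos(p - t) for p, t in zip(pred, target)) / len(pred)

def flatten(phase):
    # phase is indexed as phase[frame][bin]
    return [v for row in phase for v in row]

def diff_along_time(phase):
    # Finite difference between consecutive frames: a proxy for the
    # instantaneous frequency.
    return [phase[t + 1][f] - phase[t][f]
            for t in range(len(phase) - 1)
            for f in range(len(phase[0]))]

def diff_along_freq(phase):
    # Negative finite difference between adjacent bins: a proxy for the
    # group delay.
    return [-(phase[t][f + 1] - phase[t][f])
            for t in range(len(phase))
            for f in range(len(phase[0]) - 1)]

def phase_loss(pred, target):
    """Sum of the three cosine losses, equally weighted (hypothetical).

    Requires at least two frames and two frequency bins.
    """
    return (von_mises_nll(flatten(pred), flatten(target))
            + von_mises_nll(diff_along_time(pred), diff_along_time(target))
            + von_mises_nll(diff_along_freq(pred), diff_along_freq(target)))
```

Because each term depends only on the cosine of a phase difference, the loss is unaffected by adding multiples of 2π to the predicted phase, which is the property that makes a circular distribution a natural fit for phase values.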

_ICASSP_19POSTERA_deep_generative_model_of_speech_complex_spectrograms.pdf

[POSTER] A Deep Generative Model of Speech Complex Spectrograms (428)

Categories: Audio and Acoustic Signal Processing

136 Views

Similarity Search-based Blind Source Separation

In this paper, we propose a new method for blind source separation in which we perform a similarity search on a prepared clean speech database. The purpose of this mechanism is to separate the short utterances that we frequently encounter in real-world situations. The new method employs a local Gaussian model (LGM) for the probability density functions of the separated signals, and updates the LGM variance parameters by using the similarity search results.

Slide_ICASSP2019_sawada.pdf (2592)

Slide_ICASSP2019_sawada.pdf (440)

Categories: Source Separation and Signal Enhancement

49 Views

BREAST CANCER DETECTION BASED ON MERGING FOUR MODES MRI USING CONVOLUTIONAL NEURAL NETWORKS

The objective of the study is to develop a framework for automatic breast cancer detection by merging four imaging modes. Tumor classification and segmentation were attempted using a multi-parametric Magnetic Resonance Imaging (MRI) method on breast tumors. MRI data of the breast were obtained from 67 subjects with a 1.5T MRI scanner. The four imaging modes were T1-weighted, T2-weighted, diffusion-weighted and eTHRIVE sequences, and dynamic contrast-enhanced (DCE) MRI parameters were also acquired.

ICASSP2019-Jianguo Wei-paper5138.pptx (403)

Categories: Audio and Acoustic Signal Processing

25 Views

Blind Quality Assessment for 3D-Synthesized Images by Measuring Geometric Distortions and Image Complexity

Free viewpoint video (FVV), owing to its comprehensive applications in immersive entertainment, remote surveillance and distance education, has received extensive attention and is regarded as an important new direction in video technology development. Depth image-based rendering (DIBR) technologies are employed to synthesize FVV images in the “blind” environment. Therefore, a reliable real-time blind quality assessment metric is urgently required. However, existing state-of-the-art quality assessment methods are limited in their ability to estimate the geometric distortions generated by DIBR.

icassp-2019.pdf

poster (507)

Categories: Image, Video, and Multidimensional Signal Processing

19 Views

Blind Quality Assessment for 3D-Synthesized Images by Measuring Geometric Distortions and Image Complexity

icassp-2019.pdf

poster (372)

Categories: Image/Video Processing

18 Views

PPSAN: PERCEPTUAL-AWARE 3D POINT CLOUD SEGMENTATION VIA ADVERSARIAL LEARNING

Point cloud segmentation is a key problem in 3D multimedia signal processing. Existing methods usually use a single network structure trained with a per-point loss. These methods mainly focus on the geometric similarity between the prediction results and the ground truth, ignoring differences in visual perception. In this paper, we present a segmentation adversarial network to overcome these drawbacks. A discriminator is introduced to provide a perceptual loss that judges how reasonable a prediction is and guides the further optimization of the segmentator.

ICASSP2019_Poster-lihy.pdf (690)

Categories: Virtual reality and 3D imaging

33 Views
