ICASSP 2019

ICASSP is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The 2019 conference will feature world-class presentations by internationally renowned speakers and cutting-edge session topics, and will provide a fantastic opportunity to network with like-minded professionals from around the world.

RHFCN: Fully CNN-based Steganalysis of MP3 with Rich High-Pass Filtering

Recent studies have shown that convolutional neural networks (CNNs) can boost the performance of audio steganalysis. In this paper, we propose a fully CNN architecture for MP3 steganalysis based on rich high-pass filtering (HPF). On the one hand, multiple types of HPF are employed for "residual" extraction to amplify the traces of the embedded signal, given that the signal introduced by secret messages can be seen as high-frequency noise.
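The residual-extraction idea can be illustrated with a minimal sketch. It assumes two simple difference kernels as the high-pass filters and a made-up one-dimensional sequence as the input; the names `FILTERS`, `highpass_residual` and `residual_bank` are hypothetical, and the filters actually used in the paper may differ.

```python
# Illustrative only: the kernels below are generic difference filters,
# not the filter bank used by the paper's authors.
FILTERS = {
    "first_order": [-1, 1],
    "second_order": [1, -2, 1],
}


def highpass_residual(signal, kernel):
    """Slide the kernel over the sequence ("valid" correlation)."""
    k = len(kernel)
    return [
        sum(kernel[j] * signal[i + j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def residual_bank(signal):
    """Apply every filter in the bank and collect the residuals by name."""
    return {name: highpass_residual(signal, kernel)
            for name, kernel in FILTERS.items()}


# Hypothetical data: a flat "cover" sequence and a "stego" copy
# in which a single value was changed by +1.
cover = [10, 10, 10, 10, 10, 10]
stego = [10, 10, 11, 10, 10, 10]

cover_residuals = residual_bank(cover)  # all zeros: smooth content is removed
stego_residuals = residual_bank(stego)  # second_order -> [1, -2, 1, 0]
```

The smooth content cancels out while the small modification survives in every residual, which is the reason such residuals are a convenient input for a classifier.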

poster.pdf (462)

Categories: Watermarking and Steganography

81 Views

ANALYSIS OF COPRIME ARRAYS ON MOVING PLATFORM

Moving platforms enable sparse arrays to achieve higher degrees of freedom and lead to an increased number of lags. In essence, array motion can fill the holes in the spatial autocorrelation lags associated with a fixed platform and, therefore, increase the number of sources detectable by the same number of physical array sensors. In this paper, we consider coprime arrays and assume quasi-stationarity of the environment, where the source locations and waveforms are considered invariant over an array motion of half a wavelength.
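The hole-filling effect can be checked with a small sketch. It assumes a prototype coprime array with M = 3 and N = 5, positions measured in half-wavelength units, and a simplified model in which the array shifted by one unit is treated as a set of additional virtual sensors. The function names are hypothetical and the paper's own signal model is more detailed.

```python
def coprime_positions(m, n):
    """Sensor positions (half-wavelength units) of a prototype coprime array."""
    return sorted({m * i for i in range(n)} | {n * i for i in range(m)})


def difference_lags(positions):
    """All pairwise position differences (the difference coarray)."""
    return sorted({a - b for a in positions for b in positions})


def holes(lags):
    """Integer lags missing between 0 and the largest lag."""
    present = set(lags)
    return [lag for lag in range(max(lags) + 1) if lag not in present]


# Fixed platform: M = 3, N = 5 (illustrative choice).
fixed = coprime_positions(3, 5)            # [0, 3, 5, 6, 9, 10, 12]
fixed_holes = holes(difference_lags(fixed))  # [8, 11]

# Simplified motion model (assumption): the array translated by half a
# wavelength (one unit) contributes shifted copies of every sensor.
moved = sorted(set(fixed) | {p + 1 for p in fixed})
moved_holes = holes(difference_lags(moved))  # []
```

In this example the fixed array misses lags 8 and 11, while the synthesized array covers every lag from 0 to 13 with the same seven physical sensors.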

Slides_ANALYSIS OF COPRIME ARRAYS ON MOVING PLATFORM.pdf (813)

Categories: Sensor Array and Multichannel Signal Processing

48 Views

Low-complexity Recurrent Neural Network-based Polar Decoder with Weight Quantization Mechanism

Polar codes have drawn much attention and have been adopted in 5G New Radio (NR) due to their capacity-achieving performance. Recently, as deep learning (DL) has achieved breakthroughs in many fields, neural network decoders have been proposed to obtain faster convergence and better performance than belief propagation (BP) decoding. However, neural networks are memory-intensive, which hinders the deployment of DL in communication systems. In this work, a low-complexity recurrent neural network (RNN) polar decoder with codebook-based weight quantization is proposed.
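As a rough illustration of codebook-based weight quantization, the sketch below clusters a list of weights with a one-dimensional k-means and stores each weight as an index into a small codebook. The use of k-means, the evenly spaced initialization, and the names `nearest`, `build_codebook` and `dequantize` are assumptions made for this example; the abstract does not specify how the paper constructs its codebook.

```python
def nearest(value, codebook):
    """Index of the codebook entry closest to value."""
    return min(range(len(codebook)), key=lambda c: abs(value - codebook[c]))


def build_codebook(weights, k, iterations=20):
    """1-D k-means (Lloyd's algorithm); requires k >= 2."""
    lo, hi = min(weights), max(weights)
    # Assumed initialization: entries evenly spaced over the weight range.
    codebook = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    for _ in range(iterations):
        indices = [nearest(w, codebook) for w in weights]
        for c in range(k):
            members = [w for w, idx in zip(weights, indices) if idx == c]
            if members:  # leave empty clusters unchanged
                codebook[c] = sum(members) / len(members)
    return codebook, [nearest(w, codebook) for w in weights]


def dequantize(codebook, indices):
    """Rebuild approximate weights from the stored indices."""
    return [codebook[i] for i in indices]


# Hypothetical weights: six values stored as indices into a 3-entry codebook.
weights = [-0.9, -1.1, 0.1, -0.1, 1.0, 0.9]
codebook, indices = build_codebook(weights, k=3)
approx = dequantize(codebook, indices)
```

Only the codebook and the per-weight indices need to be stored, so memory shrinks when the number of codebook entries is much smaller than the number of distinct weight values.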

2019 ICASSP Oral.pdf

Presentation Slides (482)

Categories: Algorithm and architecture co-optimization

13 Views

HOW VIDEO OBJECT TRACKING IS AFFECTED BY IN-CAPTURE DISTORTIONS?

Video object tracking (VOT) in realistic scenarios is a difficult task. Image factors such as occlusion, clutter, confusion, object shape, and zooming, among others, affect the performance of video tracking methods. While these conditions do affect tracker performance, there is no clear distinction between challenges posed by scene content, such as occlusion and clutter, and challenges due to distortions generated by the capture, compression, processing, and transmission of videos. This paper is concerned with the latter interpretation of quality as it affects VOT performance.

4127_HOW_VIDEO_OBJECT_TRACKING_IS_AFFECTED_BY_.pdf

ObjectTracking_4127 (364)

Categories: Image/Video Processing

42 Views

SOUND SOURCE LOCALIZATION IN A REVERBERANT ROOM USING HARMONIC BASED MUSIC

The localization of acoustic sound sources is beneficial to signal processing applications such as speech enhancement, dereverberation, separation, and tracking. Difficulties in position estimation arise in real-world environments because coherent reflections degrade the performance of subspace localization techniques. This paper proposes a multiple signal classification (MUSIC) subspace localization method that is suitable for reverberant rooms. The method is based on the modal decomposition of a room's region-to-region transfer function, which is assumed to be known.

u5351515_lachlan_birnie_ICASSP2019_Poster.pdf (539)

Categories: Room Acoustics and Acoustic System Modeling

35 Views

Learning Shared Vector Representations of Lyrics and Chords in Music

Music has a powerful influence on a listener's emotions. In this paper, we represent lyrics and chords in a shared vector space using a phrase-aligned chord-and-lyrics corpus. We show that models that use these shared representations predict a listener's emotion while hearing musical passages better than models that do not use these representations. Additionally, we conduct a visual analysis of these learnt shared vector representations and explain how they support existing theories in music.

Learning_Shared_Reps_ICASSP_Pres_2(1).pdf (814)

Categories: Multimodal signal processing

68 Views

Time domain Spherical harmonic analysis for adaptive noise cancellation over a spatial region

Active Noise Cancellation (ANC) is a well-researched topic for minimizing unwanted acoustic noise, and spatial ANC is a recently introduced concept that focuses on continuous spatial regions. Adaptive filter design for spatial ANC is often based on the frequency-domain spherical harmonic decomposition method, whose major limitation is increased system latency. In this paper, we develop a time-domain spherical harmonic based signal decomposition method and use it to develop two time-space domain feed-forward adaptive filters for spatial ANC.

ICASSP_POSTER_decided.pdf (452)

Categories: Active Noise Control

25 Views

A NEURAL NETWORK BASED RANKING FRAMEWORK TO IMPROVE ASR WITH NLU RELATED KNOWLEDGE DEPLOYED

This work proposes a new neural network framework to simultaneously rank multiple hypotheses generated by one or more automatic speech recognition (ASR) engines for a speech utterance. Features fed into the framework include not only those calculated from the ASR information but also natural language understanding (NLU) related features, such as trigger features capturing long-distance constraints between word/slot pairs and BLSTM features representing intent-sensitive sentence embeddings.

Poster_ICASSP2019.pdf (689)

Categories: Large Vocabulary Continuous Recognition/Search (SPE-LVCR)

88 Views

Variational and Hierarchical Recurrent Autoencoder

Despite great success in learning representations for image data, it is challenging to learn stochastic latent features from natural language based on variational inference. The difficulty in stochastic sequential learning is due to posterior collapse, which is caused by an autoregressive decoder that tends to be so strong that insufficient latent information is learned during optimization. To compensate for this weakness in the learning procedure, a sophisticated latent structure is required to ensure good convergence so that random features are sufficiently captured for sequential decoding.
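For reference, posterior collapse is usually described in terms of the standard variational objective (the evidence lower bound). The form below is the generic textbook expression, with the symbols \theta, \phi, x and z introduced here for illustration; it is not necessarily the exact objective used in this paper.

```latex
\mathcal{L}(\theta, \phi; x)
  = \mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big]
  - \mathrm{KL}\big(q_\phi(z \mid x) \,\|\, p(z)\big)
```

When the decoder p_\theta(x \mid z) is expressive enough to model x without using z, the objective can be maximized by setting q_\phi(z \mid x) \approx p(z), which drives the KL term toward zero and leaves the latent variable uninformative.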

[ICASSP 2019] Variational and hierarchical recurrent autoencoder.pdf (567)

Icassp19_hier.pdf (443)

Categories: Sequential learning; sequential decision methods (MLR-SLER)
Bayesian learning; Bayesian signal processing (MLR-BAYL)

27 Views

COMPACT CONVOLUTIONAL RECURRENT NEURAL NETWORKS VIA BINARIZATION FOR SPEECH EMOTION RECOGNITION

Despite great advances, most recently developed automatic speech recognition systems work in a server-client manner and thus often incur high computational costs, such as storage size and memory accesses. This, however, does not satisfy the increasing demand for a succinct model that can run smoothly on embedded devices like smartphones.

ICASSP19005.pdf (449)

Categories: Audio and Acoustic Signal Processing

28 Views
