
ICASSP is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The ICASSP 2020 conference will feature world-class presentations by internationally renowned speakers, cutting-edge session topics, and ample opportunity to network with like-minded professionals from around the world.

Degradation due to additive noise is a significant roadblock to the real-world deployment of Speech Emotion Recognition (SER) systems. Most previous work in this field has addressed noise degradation either at the signal level or at the feature level. In this paper, to improve the robustness of SER under additive noise, we propose multi-condition training and data augmentation using an utterance-level parametric generative noise model. The generative noise model is designed to generate noise types that span the entire noise space in the mel-filterbank energy domain.
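The abstract does not give the model's exact parametrization; as one possible reading, a minimal sketch of an utterance-level generative noise model in the mel-filterbank energy domain could look like the following (the smoothed random log-envelope and the SNR-based mixing are assumptions, not the paper's method):

```python
import numpy as np

def sample_noise_melspec(n_frames, n_mels, rng, smooth=5):
    """Sample a random utterance-level noise spectrum in the
    mel-filterbank energy domain (hypothetical parametrization:
    a smoothed random log-energy envelope, held constant over
    the utterance)."""
    envelope = rng.normal(size=n_mels)
    kernel = np.ones(smooth) / smooth
    envelope = np.convolve(envelope, kernel, mode="same")  # smooth across bands
    log_energy = envelope - envelope.max()                 # peak normalized to 0 dB
    return np.exp(log_energy)[None, :].repeat(n_frames, 0)

def augment(clean_mel, snr_db, rng):
    """Mix a sampled noise spectrum into clean mel energies at a
    target utterance-level SNR (multi-condition augmentation)."""
    noise = sample_noise_melspec(*clean_mel.shape, rng)
    # Scale the noise so that total energies satisfy the requested SNR.
    scale = clean_mel.sum() / (noise.sum() * 10 ** (snr_db / 10))
    return clean_mel + scale * noise

rng = np.random.default_rng(0)
clean = rng.random((100, 40)) + 0.1   # stand-in mel energies (100 frames, 40 bands)
noisy = augment(clean, snr_db=10, rng=rng)
```

Sampling a fresh envelope per utterance yields noise types scattered over the mel-energy space, which is the property the abstract attributes to its model.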


Bridge weigh-in-motion (BWIM) is a technique for detecting heavy vehicles that may cause serious damage to in-service bridges. BWIM works by analyzing the strain signals measured at points on the bridge in terms of the bridge components’ responses to the axle loads. In current practice, a BWIM system requires multiple strain sensors to collect vehicle properties, including speed and axle positions, for accurate load estimation, which may limit the system’s lifespan.
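The abstract does not spell out the load-estimation step; as background, classical BWIM (often attributed to Moses) fits the axle loads to the measured strain by least squares against the bridge influence line once speed and axle spacings are known. A simplified sketch under that assumption (the triangular influence line, speed, and loads below are synthetic):

```python
import numpy as np

def estimate_axle_loads(strain, influence_line, axle_offsets, speed, dt):
    """Estimate static axle loads from one strain record by least
    squares against the influence line, in the spirit of the classic
    Moses BWIM algorithm. Simplified sketch: assumes the influence
    line (one ordinate per metre), vehicle speed, and axle spacings
    are already known."""
    n = len(strain)
    x = speed * dt * np.arange(n)                     # first-axle position (m)
    grid = np.arange(len(influence_line), dtype=float)
    # One column per axle: influence ordinate under that axle over time.
    A = np.stack([np.interp(x - d, grid, influence_line, left=0.0, right=0.0)
                  for d in axle_offsets], axis=1)
    loads, *_ = np.linalg.lstsq(A, strain, rcond=None)
    return loads

# Synthetic check: a 2-axle vehicle (hypothetical 120/80 unit loads,
# 4 m axle spacing) crossing a 30 m triangular influence line at 20 m/s.
il = np.concatenate([np.linspace(0.0, 1.0, 16), np.linspace(1.0, 0.0, 15)[1:]])
true_loads, offsets = np.array([120.0, 80.0]), [0.0, 4.0]
n, speed, dt = 200, 20.0, 0.01
x = speed * dt * np.arange(n)
grid = np.arange(len(il), dtype=float)
strain = sum(w * np.interp(x - d, grid, il, left=0.0, right=0.0)
             for w, d in zip(true_loads, offsets))
loads = estimate_axle_loads(strain, il, offsets, speed, dt)
```

The dependence on known speed and axle positions in this formulation is exactly why conventional systems need the extra sensors the abstract mentions.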


Automatic analysis of highly crowded scenes has attracted extensive attention in computer vision research. Previous approaches to crowd counting have achieved promising performance across various benchmarks. However, for real-world deployment, the model should run as fast as possible while maintaining accuracy. In this paper, we propose a compact convolutional neural network for crowd counting that learns a more efficient model with a small number of parameters.
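The abstract does not describe the network itself; as background, crowd-counting CNNs are usually trained to regress a density map whose integral equals the crowd count. A sketch of building that training target from head annotations (the point list and kernel width are illustrative, not from the paper):

```python
import numpy as np

def density_map(points, shape, sigma=4.0):
    """Build a ground-truth density map from head annotations:
    one unit-mass Gaussian per annotated head, so the map
    integrates to the crowd count (the standard density-map target)."""
    h, w = shape
    dmap = np.zeros(shape)
    r = int(3 * sigma)
    yy, xx = np.mgrid[-r:r + 1, -r:r + 1]
    g = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    g /= g.sum()                                  # unit mass per person
    for px, py in points:
        y0, x0 = int(round(py)), int(round(px))
        ys, ye = max(0, y0 - r), min(h, y0 + r + 1)
        xs, xe = max(0, x0 - r), min(w, x0 + r + 1)
        dmap[ys:ye, xs:xe] += g[ys - (y0 - r):ye - (y0 - r),
                                xs - (x0 - r):xe - (x0 - r)]
    return dmap

pts = [(30, 40), (80, 60), (100, 100)]            # (x, y) head annotations
d = density_map(pts, (128, 128))
print(round(d.sum(), 3))                          # 3.0 (no Gaussian clipped here)
```

At inference time the predicted count is simply the sum over the regressed map, which is what makes a small, fast regression network sufficient for counting.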


Object monitoring can be performed with change detection algorithms. However, for image pairs with a large perspective difference, change detection performance is usually degraded by inaccurate image registration. To address these difficulties, a novel object-specific change detection approach for object monitoring is proposed in this paper. In contrast to traditional approaches, the proposed approach is robust to view-angle variation and does not require explicit image registration. Experiments demonstrate the effectiveness and advantages of the proposed approach.


The recent success of the Transformer-based sequence-to-sequence framework on various Natural Language Processing tasks has motivated its application to Automatic Speech Recognition. In this work, we explore the application of Transformers to low-resource Indian languages in a multilingual framework. We explore various methods of incorporating language information into a multilingual Transformer: (i) at the decoder and (ii) at the encoder. These methods include using language identity tokens or providing language information to the acoustic vectors.
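One common way to inject language identity at the decoder is to prepend a language ID token to the target token sequence, so decoding is conditioned on the language from the first step. A toy sketch (the special-token vocabulary and language codes below are hypothetical, not the paper's):

```python
# Hypothetical special-token vocabulary; real systems derive these
# from their own tokenizer.
SPECIAL = {"<sos>": 0, "<eos>": 1, "<hi>": 2, "<ta>": 3, "<gu>": 4}

def decoder_targets(token_ids, lang):
    """Prepend a language ID token right after <sos> so a multilingual
    decoder sees the language identity before any text tokens."""
    return [SPECIAL["<sos>"], SPECIAL[f"<{lang}>"]] + token_ids + [SPECIAL["<eos>"]]

print(decoder_targets([10, 11, 12], "ta"))   # [0, 3, 10, 11, 12, 1]
```

The encoder-side variants mentioned in the abstract instead attach language information to the acoustic feature vectors, e.g. by concatenating a language embedding to each frame.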