ICASSP 2018

ICASSP is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The 2019 conference will feature world-class presentations by internationally renowned speakers and cutting-edge session topics, and will provide a fantastic opportunity to network with like-minded professionals from around the world.

ADAPTIVE VISUAL TARGET TRACKING BASED ON LABEL CONSISTENT K-SVD SPARSE CODING AND KERNEL PARTICLE FILTER

We propose an adaptive visual target tracking algorithm based on Label-Consistent K-Singular Value Decomposition (LC-KSVD) dictionary learning. To construct target templates, local patch features are sampled from the foreground and background of the target. LC-KSVD is then applied to these local patches to simultaneously estimate a low-dimensional dictionary and a set of classification parameters (CP). To track the target over time, a kernel particle filter (KPF) is proposed that integrates both local and global motion information of the target.

Poster-2119.pdf (446)

Categories: Image, Video, and Multidimensional Signal Processing

20 Views

CONVOLUTIONAL NEURAL NETWORKS AND MULTITASK STRATEGIES FOR SEMANTIC MAPPING OF NATURAL LANGUAGE INPUT TO A STRUCTURED DATABASE

In this work, we investigate mapping both natural language food and quantity descriptions to matching USDA database entries. We demonstrate that a convolutional neural network (CNN) model with a softmax layer on top to directly predict the most likely database matches outperforms our previous state-of-the-art approach of learning binary classification and subsequently ranking database entries using similarity scores with the learned embeddings.
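The two decision rules contrasted in this abstract can be sketched side by side. This is only a minimal illustration, not the authors' model: the entry names, logits, and embedding vectors are made-up placeholders, and the CNN that would produce them is omitted.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_entry(logits, entries):
    """Direct prediction: pick the database entry with the highest softmax probability."""
    probs = softmax(logits)
    best = max(range(len(entries)), key=probs.__getitem__)
    return entries[best], probs[best]

def rank_by_similarity(query_vec, entry_vecs, entries):
    """Ranking baseline: order entries by dot-product similarity to the query embedding."""
    scores = [sum(q * e for q, e in zip(query_vec, vec)) for vec in entry_vecs]
    order = sorted(range(len(entries)), key=scores.__getitem__, reverse=True)
    return [entries[i] for i in order]

# Hypothetical database entries and model outputs, for illustration only.
entries = ["apple, raw", "banana, raw", "milk, whole"]
top_entry, top_prob = predict_entry([0.1, 2.0, -1.0], entries)
ranking = rank_by_similarity([1.0, 0.0], [[0.2, 0.9], [0.8, 0.1], [0.5, 0.5]], entries)
```

In the first rule the network scores every database entry at once; in the second, entries are ranked after the fact from learned embeddings.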

icassp_2018.pdf (364)

Categories: Spoken Language Understanding (SLP-UNDE)
Spoken and Multimodal Dialog Systems and Applications (SLP-SMMD)

11 Views

VIDEO ENHANCEMENT WITH CONVEX OPTIMIZATION METHODS

Video enhancement methods make it possible to optimize the viewing of video content at the end-user side. Most approaches do not consider the compressed nature of the available content. In the present work, we build upon a recently proposed video enhancement approach that explicitly models a compression stage. Applying the enhancement framework to compressed representations requires extracting specific syntax elements during decoding. This additional information embeds the enhanced result in a domain that closely fits the observation.

Poster_ICASSP_2018.pdf

poster presentation for video enhancement with convex optimization methods (994)

Categories: Multimedia and DTV Technologies

32 Views

High Order Recurrent Neural Networks for Acoustic Modelling

Vanishing long-term gradients are a major issue in training standard recurrent neural networks (RNNs), which can be alleviated by long short-term memory (LSTM) models with memory cells. However, the extra parameters associated with the memory cells mean an LSTM layer has four times as many parameters as an RNN with the same hidden vector size. This paper addresses the vanishing gradient problem using a high order RNN (HORNN) which has additional connections from multiple previous time steps.
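The "four times" figure can be checked with a short parameter count. The sketch below assumes a plain LSTM without peephole connections or a projection layer. The high order count assumes one full recurrent matrix per previous time step, which is an illustrative assumption rather than the structure used in the paper. The layer sizes are arbitrary.

```python
def rnn_layer_params(input_size: int, hidden_size: int) -> int:
    """Input weights + recurrent weights + one bias vector."""
    return hidden_size * (input_size + hidden_size) + hidden_size

def lstm_layer_params(input_size: int, hidden_size: int) -> int:
    """Input, forget, and output gates plus the cell candidate:
    four blocks, each shaped like a plain RNN layer."""
    return 4 * rnn_layer_params(input_size, hidden_size)

def high_order_rnn_params(input_size: int, hidden_size: int, order: int) -> int:
    """ASSUMPTION: one full recurrent matrix for each of the `order`
    previous hidden states, sharing one input matrix and one bias."""
    return hidden_size * (input_size + order * hidden_size) + hidden_size

# Arbitrary example sizes: 40-dimensional input, 500 hidden units.
rnn = rnn_layer_params(40, 500)            # 270500
lstm = lstm_layer_params(40, 500)          # 1082000
hornn = high_order_rnn_params(40, 500, 2)  # 520500
```

Under these assumptions, a second order layer adds one extra recurrent matrix and stays well below the LSTM count for the same hidden size.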

cz277-ICASSP18-Poster-v3.pdf (768)

Categories: Acoustic Modeling for Automatic Speech Recognition (SPE-RECO)
Spoken Language Processing

14 Views

Prediction of Negative Symptoms of Schizophrenia from Emotion Related Low-Level Speech Signals

Negative symptoms of schizophrenia are often associated with the blunting of emotional affect, which creates a serious impediment in the daily functioning of patients. Affective prosody is almost always adversely impacted in such cases, and this is known to manifest in the low-level acoustic signals of prosody. To automate and simplify the assessment of the severity of emotion-related symptoms of schizophrenia, we used these low-level acoustic signals to predict the subjective expert ratings assigned by a trained psychologist during an interview with the patient.

ICASSP_Poster.pdf (1674)

Categories: Speech Analysis (SPE-ANLS)

20 Views

ROBUST SEQUENTIAL TESTING OF MULTIPLE HYPOTHESES IN DISTRIBUTED SENSOR NETWORKS

The problem of sequential multiple hypothesis testing in a distributed sensor network is considered and two algorithms are proposed: the Consensus + Innovations Matrix Sequential Probability Ratio Test (CIMSPRT) for multiple simple hypotheses and the robust Least-Favorable-Density-CIMSPRT for hypotheses with uncertainties in the corresponding distributions. Simulations are performed to verify and evaluate the performance of both algorithms under different network conditions and noise contaminations.
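For readers unfamiliar with sequential testing, the building block behind these algorithms is the sequential probability ratio test. The sketch below is the classical single-sensor, two-hypothesis Wald SPRT for a Gaussian mean with known variance. It is not the CIMSPRT or its robust variant, and the function name and default error rates are illustrative choices.

```python
import math

def sprt_gaussian(samples, mu0, mu1, sigma, alpha=0.05, beta=0.05):
    """Wald's SPRT for H0: mean = mu0 versus H1: mean = mu1.

    alpha is the target false-alarm rate and beta the target miss rate.
    Returns the decision and the number of samples consumed.
    """
    upper = math.log((1 - beta) / alpha)   # crossing it accepts H1
    lower = math.log(beta / (1 - alpha))   # crossing it accepts H0
    llr = 0.0                              # cumulative log-likelihood ratio
    count = 0
    for x in samples:
        count += 1
        # log p1(x) - log p0(x) for Gaussians with a common variance
        llr += ((x - mu0) ** 2 - (x - mu1) ** 2) / (2 * sigma ** 2)
        if llr >= upper:
            return "H1", count
        if llr <= lower:
            return "H0", count
    return "undecided", count

# Noise-free toy data: every observation sits exactly on one hypothesis.
decision_h1 = sprt_gaussian([1.0] * 20, mu0=0.0, mu1=1.0, sigma=1.0)
decision_h0 = sprt_gaussian([0.0] * 20, mu0=0.0, mu1=1.0, sigma=1.0)
```

The test stops as soon as the accumulated evidence crosses either threshold, so the number of samples used depends on the data rather than being fixed in advance.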

icassp2018_leonard_poster_final.pdf

Poster: ROBUST SEQUENTIAL TESTING OF MULTIPLE HYPOTHESES IN DISTRIBUTED SENSOR NETWORKS (674)

Categories: Statistical Signal Processing

12 Views

A feature fusion method based on extreme learning machine for speech emotion recognition

Speech emotion recognition is important for understanding users' intentions in human-computer interaction. However, it is a challenging task, partly because it is unclear which features and models are effective for distinguishing emotions. Previous studies apply a convolutional neural network (CNN) directly to spectrograms to extract features, and bidirectional long short-term memory (BLSTM) is the state-of-the-art model. However, CNN-BLSTM has two problems. First, it does not use heuristic features based on prior knowledge.

poster-guolili.pdf (639)

Categories: Other applications of machine learning (MLR-APPL)

202 Views

MEASURING UNCERTAINTY IN DEEP REGRESSION MODELS: THE CASE OF AGE ESTIMATION FROM SPEECH

icassp.pptx (497)

icassp.pptx (541)

Categories: Speech Analysis (SPE-ANLS)

23 Views

Yedrouj-Net: An efficient CNN for spatial steganalysis

For about 10 years, detecting the presence of a secret message hidden in an image was performed with an Ensemble Classifier trained with Rich features. In recent years, studies such as Xu et al. have indicated that well-designed Convolutional Neural Networks (CNN) can achieve comparable performance to the two-step machine learning approaches.

In this paper we propose a CNN that outperforms the state-of-the-art in terms of error probability. The proposition is in the continuity of what has been recently proposed and it is a clever fusion