Sorry, you need to enable JavaScript to visit this website.

ICASSP is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The 2019 conference will feature world-class presentations by internationally renowned speakers, cutting-edge session topics and provide a fantastic opportunity to network with like-minded professionals from around the world. Visit website.

We propose an adaptive visual target tracking algorithm based on Label-Consistent K-Singular Value Decomposition (LC-KSVD) dictionary learning. To construct target templates, local patch features are sampled from foreground and background of the target. LC-KSVD then is applied to these local patches to simultaneously estimate a set of low-dimension dictionary and classification parameters (CP). To track the target over time, a kernel particle filter (KPF) is proposed that integrates both local and global motion information of the target.

Categories:
18 Views

In this work, we investigate mapping both natural language food and quantity descriptions to matching USDA database entries. We demonstrate that a convolutional neural network (CNN) model with a softmax layer on top to directly predict the most likely database matches outperforms our previous state-of-the-art approach of learning binary classification and subsequently ranking database entries using similarity scores with the learned embeddings.

Categories:
7 Views

Video enhancement methods enable to optimize the viewing of video content at the end-user side. Most approaches do not consider the compressed nature of the available content. In the present work, we build upon a recently proposed video enhancement approach that explicitly models a compression stage. To apply the enhancement framework on compressed representations requires to extract specific syntax elements during their decoding. This additional information embeds the enhanced result in a domain that closely fits the observation.

Categories:
30 Views

Vanishing long-term gradients are a major issue in training standard recurrent neural networks (RNNs), which can be alleviated by long short-term memory (LSTM) models with memory cells. However, the extra parameters associated with the memory cells mean an LSTM layer has four times as many parameters as an RNN with the same hidden vector size. This paper addresses the vanishing gradient problem using a high order RNN (HORNN) which has additional connections from multiple previous time steps.

Categories:
5 Views

Negative symptoms of schizophrenia are often associated with the blunting of emotional affect which creates a serious impediment in the daily functioning of the patients. Affective prosody is almost always adversely impacted in such cases, and is known to exhibit itself through the low-level acoustic signals of prosody. To automate and simplify the process of assessment of severity of emotion related symptoms of schizophrenia, we utilized these low-level acoustic signals to predict the expert subjective ratings assigned by a trained psychologist during an interview with the patient.

Categories:
18 Views

The problem of sequential multiple hypothesis testing in a distributed sensor network is considered and two algorithms are proposed: the Consensus + Innovations Matrix Sequential Probability Ratio Test (CIMSPRT) for multiple simple hypotheses and the robust Least-Favorable-Density-CIMSPRT for hypotheses with uncertainties in the corresponding distributions. Simulations are performed to verify and evaluate the performance of both algorithms under different network conditions and noise contaminations.

Categories:
5 Views

Speech emotion recognition is important to understand users' intention in human-computer interaction. However, it is a challenging task partly because we cannot clearly know which feature and model are effective to distinguish emotions. Previous studies utilize convolutional neural network (CNN) directly on spectrograms to extract features, and bidirectional long short term memory (BLSTM) is the state-of-the-art model. However, there are two problems of CNN-BLSTM. Firstly, it doesn't utilize heuristic features based on priori knowledge.

Categories:
201 Views

For about 10 years, detecting the presence of a secret message hidden
in an image was performed with an Ensemble Classifier trained
with Rich features. In recent years, studies such as Xu et al. have
indicated that well-designed Convolutional Neural Networks(CNN)
can achieve comparable performance to the two-step machine learning
approaches.
In this paper we propose a CNN that outperforms the state-ofthe-
art in terms of error probability. The proposition is in the continuity
of what has been recently proposed and it is a clever fusion

Categories:
9 Views

Pages