ICASSP is the world's largest and most comprehensive technical conference on signal processing and its applications. It provides a networking opportunity for like-minded professionals from around the world. The ICASSP 2017 conference will feature presentations by internationally renowned speakers and cutting-edge session topics.

In this paper, we focus on the usefulness of verbal events for speech-based emotion recognition. First, we propose using phoneme sequences to encode verbal cues related to the expression of emotions, and we introduce lexical features based on these phoneme sequences for use in automatic emotion recognition systems where manual transcripts are not available. Second, we present a novel estimate of the emotional salience of verbal cues, applicable to both phoneme sequences and words.
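
As a rough illustration of lexical features built from phoneme sequences, the sketch below counts phoneme n-grams in the output of a phoneme recognizer. It is not the paper's feature set or salience estimate, neither of which is specified in the abstract; the function names, the bigram order, and the example symbols are all assumptions made for illustration.

```python
from collections import Counter


def phoneme_ngrams(phonemes, n):
    """Return all contiguous n-grams (as tuples) from a phoneme sequence."""
    return [tuple(phonemes[i:i + n]) for i in range(len(phonemes) - n + 1)]


def ngram_features(phonemes, vocabulary, n=2):
    """Count how often each n-gram in `vocabulary` occurs in the sequence."""
    counts = Counter(phoneme_ngrams(phonemes, n))
    return [counts[g] for g in vocabulary]


# Hypothetical recognizer output (ARPAbet-like symbols, illustration only).
utterance = ["OW", "N", "OW", "N", "OW"]
# Hypothetical bigram vocabulary; a real system would derive it from training data.
vocabulary = [("OW", "N"), ("N", "OW"), ("N", "N")]
features = ngram_features(utterance, vocabulary)
```

A count vector of this kind can be computed without manual transcripts, since it depends only on the recognized phoneme sequence.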


Object-based representations of audio content are increasingly used in entertainment systems to deliver immersive and personalized experiences. Efficient storage and transmission of such content can be achieved by joint object coding algorithms that convey a reduced number of downmix signals together with parametric side information that enables object reconstruction in the decoder. This paper presents an approach to improve the performance of joint object coding by adding one or more decorrelators to the decoding process.


Road detection is a key component of Advanced Driving Assistance Systems, providing vehicles with valid space and candidate regions for objects. Mainstream road detection methods have focused on extracting discriminative features. In this paper, we propose a robust feature fusion framework, called “Feature++”, which combines superpixel features with 3D features extracted from stereo images. A neural network classifier is then trained to decide whether or not a superpixel belongs to the road region. Finally, the classification results are further refined by a conditional random field.



Echo labeling, the challenging task of assigning acoustic reflections to image sources, is equivalent to the highly important disambiguation task in room geometry inference. A method using the Radon transform, an image processing tool, is proposed to address this challenge. The method relies on detecting acoustic wavefronts in stacks of room impulse responses, obtained with a uniform linear array of loudspeakers and one microphone. Our experiments show that the proposed method can both detect and label echoes.
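
To make the idea of wavefront detection in an impulse response stack concrete, the sketch below applies a simple discrete linear Radon transform (a slant stack) to a synthetic stack and locates its peak. It is not the paper's algorithm: it assumes wavefronts that appear as straight lines across the array (real wavefronts are generally curved), and the function name `slant_stack`, the slope grid, and the synthetic data are hypothetical.

```python
import numpy as np


def slant_stack(stack, slopes):
    """Discrete linear Radon transform of a (channels x samples) array.

    For each slope p (in samples per channel) and intercept t0, sums
    stack[m, t0 + round(p * m)] over all channels m that stay in range.
    """
    n_ch, n_t = stack.shape
    out = np.zeros((len(slopes), n_t))
    ch = np.arange(n_ch)
    for i, p in enumerate(slopes):
        shifts = np.rint(p * ch).astype(int)
        for t0 in range(n_t):
            idx = t0 + shifts
            valid = (idx >= 0) & (idx < n_t)
            out[i, t0] = stack[ch[valid], idx[valid]].sum()
    return out


# Synthetic stack (assumption): 8 loudspeakers, one straight wavefront
# with slope 2 samples per channel, arriving at sample 10 on channel 0.
n_ch, n_t = 8, 64
stack = np.zeros((n_ch, n_t))
for m in range(n_ch):
    stack[m, 10 + 2 * m] = 1.0

slopes = np.array([-2, -1, 0, 1, 2, 3])
radon = slant_stack(stack, slopes)

# The peak of the transform gives the slope and intercept of the wavefront.
i, t0 = np.unravel_index(np.argmax(radon), radon.shape)
detected_slope = slopes[i]
```

In this toy setting, each line in the stack maps to a single peak in the transform domain, which is what makes the transform convenient for detecting wavefronts.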


Millimeter wave (mmWave) systems will likely employ large antenna arrays at both the transmitter and receiver for directional beamforming. Hybrid analog/digital MIMO architectures have previously been proposed to leverage both array gain and multiplexing gain while reducing the power consumption of analog-to-digital converters. Designing the hybrid precoders/combiners requires channel knowledge, which is difficult to obtain because of the large antenna arrays and the frequency-selective nature of the channel.


Structured sparse representation has recently been found to achieve better efficiency and robustness in exploiting the target appearance model in tracking systems with both holistic and local information. Therefore, to better discriminate multiple targets from their background simultaneously, we propose a novel video-based multi-target tracking system that combines the particle probability hypothesis density (PHD) filter with discriminative group-structured dictionary learning.
