ICASSP is the world's largest and most comprehensive technical conference on signal processing and its applications. It provides a networking opportunity for like-minded professionals from around the world. The ICASSP 2017 conference will feature presentations by internationally renowned speakers and cutting-edge session topics.
In this paper we focus on the usefulness of verbal events for speech-based emotion recognition. First, we propose using phoneme sequences to encode verbal cues related to the expression of emotions, and we introduce lexical features based on these phoneme sequences for automatic emotion recognition systems where manual transcripts are not available. Second, we present a novel estimate of the emotional salience of verbal cues, applicable to both phoneme sequences and words.
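As a rough illustration of the idea of scoring phoneme sequences by emotional salience, the sketch below extracts phoneme n-grams and scores each one with a common information-theoretic measure: the divergence between the emotion distribution given the n-gram and the emotion prior. This is not the novel estimate the abstract refers to, which is not described here. The function names, the toy phoneme symbols, and the emotion labels are all assumptions made for the example; in practice the phoneme sequences would come from a phoneme recognizer.

```python
import math
from collections import Counter, defaultdict


def phoneme_ngrams(phonemes, n):
    """Return all contiguous phoneme n-grams of one utterance as tuples."""
    return [tuple(phonemes[i:i + n]) for i in range(len(phonemes) - n + 1)]


def salience(utterances, n=2):
    """Score each n-gram by sum_k P(e_k | g) * log(P(e_k | g) / P(e_k)).

    utterances: list of (phoneme list, emotion label) pairs (hypothetical data).
    Each n-gram is counted once per utterance it occurs in.
    """
    class_counts = Counter(label for _, label in utterances)
    total = sum(class_counts.values())

    gram_class = defaultdict(Counter)
    for phones, label in utterances:
        for gram in set(phoneme_ngrams(phones, n)):
            gram_class[gram][label] += 1

    scores = {}
    for gram, counts in gram_class.items():
        n_gram = sum(counts.values())
        score = 0.0
        for label, count in counts.items():
            p_cond = count / n_gram               # P(emotion | n-gram)
            p_prior = class_counts[label] / total  # P(emotion)
            score += p_cond * math.log(p_cond / p_prior)
        scores[gram] = score
    return scores


# Toy data, invented for illustration only.
toy = [
    (["AE", "NG", "G", "R", "IY"], "angry"),
    (["AE", "NG", "ER"], "angry"),
    (["HH", "AE", "P", "IY"], "happy"),
    (["HH", "AE", "P", "AH"], "happy"),
]
bigram_scores = salience(toy, n=2)
unigram_scores = salience(toy, n=1)
```

An n-gram that occurs equally often in every class scores zero under this measure, while one that occurs in only one class scores the negative log of that class prior.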
- Read more about OPTIMAL TRANSCEIVER DESIGN IN MULTI-USER MULTIPLE-INPUT MULTIPLE-OUTPUT WIRELESS NETWORKS
- Read more about Decorrelation for Audio Object Coding
Object-based representations of audio content are increasingly used in entertainment systems to deliver immersive and personalized experiences. Efficient storage and transmission of such content can be achieved by joint object coding algorithms that convey a reduced number of downmix signals together with parametric side information that enables object reconstruction in the decoder. This paper presents an approach to improve the performance of joint object coding by adding one or more decorrelators to the decoding process.
- Read more about FEATURE++: CROSS DIMENSION FEATURE FUSION FOR ROAD DETECTION
Road detection is a key component of Advanced Driving Assistance Systems: it provides the valid driving space and candidate object regions for vehicles. Mainstream road detection methods have focused on extracting discriminative features. In this paper, we propose a robust feature fusion framework, called “Feature++”, which combines superpixel features with 3D features extracted from stereo images. A neural network classifier is then trained to decide whether or not a superpixel belongs to the road region. Finally, the classification results are further refined by a conditional random field.
- Read more about Time of Arrival Disambiguation Using the Linear Radon Transform
Echo labeling, the challenging task of assigning acoustic reflections to image sources, is equivalent to the highly important disambiguation task in room geometry inference. A method using the Radon transform, an image processing tool, is proposed to address this challenge. The method relies on detecting acoustic wavefronts in stacks of room impulse responses obtained with a uniform linear array of loudspeakers and one microphone. Our experiments show that the proposed method can both detect and label echoes.
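The wavefront-detection idea can be sketched with a discrete linear Radon transform (slant stack): in a stack of impulse responses ordered by loudspeaker index, an echo whose arrival time changes linearly across the array forms a straight line, and summing along candidate lines concentrates it into a single peak. The sketch below is a minimal illustration under strong assumptions, not the authors' method: the stack is synthetic, arrival times fall exactly on integer samples, and the names `linear_radon` and `strongest_lines` are hypothetical.

```python
import numpy as np


def linear_radon(stack, slopes):
    """Sum a (traces x samples) stack along lines t = tau + p * x.

    Returns an array of shape (len(slopes), samples), indexed by
    slope index and intercept tau (in samples).
    """
    n_traces, n_samples = stack.shape
    taus = np.arange(n_samples)
    out = np.zeros((len(slopes), n_samples))
    for i, p in enumerate(slopes):
        for x in range(n_traces):
            idx = np.rint(taus + p * x).astype(int)
            valid = (idx >= 0) & (idx < n_samples)  # drop samples off the trace
            out[i, valid] += stack[x, idx[valid]]
    return out


def strongest_lines(radon, slopes, count):
    """Return the (slope, intercept) pairs of the `count` largest values."""
    flat = np.argsort(radon, axis=None)[-count:]
    rows, cols = np.unravel_index(flat, radon.shape)
    return {(int(slopes[r]), int(c)) for r, c in zip(rows, cols)}


# Synthetic stack: 8 loudspeakers, 200 samples, two idealized echoes.
stack = np.zeros((8, 200))
for x in range(8):
    stack[x, 20 + 3 * x] = 1.0  # echo arriving later along the array
    stack[x, 90 - 2 * x] = 1.0  # echo arriving earlier along the array

slopes = np.arange(-5, 6)  # candidate slopes, in samples per loudspeaker
radon = linear_radon(stack, slopes)
peaks = strongest_lines(radon, slopes, count=2)
```

Each recovered (slope, intercept) pair describes one wavefront across the array. Real impulse responses would need interpolation for non-integer delays and a detection threshold rather than a fixed peak count.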
- Read more about Time-Domain Channel Estimation for Wideband Millimeter Wave Systems With Hybrid Architecture
Millimeter wave (mmWave) systems will likely employ large antennas at both the transmitter and receiver for directional beamforming. Hybrid analog/digital MIMO architectures have been proposed previously for leveraging both array gain and multiplexing gain, while reducing the power consumption in analog-to-digital converters. Channel knowledge is needed to design the hybrid precoders/combiners, which is difficult to obtain due to the large antenna arrays and the frequency selective nature of the channel.
- Read more about Patch-based Multiple View Image Denoising with Occlusion Handling
- Read more about AFFECT RECOGNITION FROM LIP ARTICULATIONS
- Read more about PARTICLE PHD FILTER BASED MULTI-TARGET TRACKING USING DISCRIMINATIVE GROUP-STRUCTURED DICTIONARY LEARNING
Structured sparse representation has recently been found to achieve better efficiency and robustness in exploiting the target appearance model in tracking systems with both holistic and local information. Therefore, to better discriminate multiple targets from their background simultaneously, we propose a novel video-based multi-target tracking system that combines the particle probability hypothesis density (PHD) filter with discriminative group-structured dictionary learning.