ICASSP is the world's largest and most comprehensive technical conference on signal processing and its applications. It provides a networking opportunity for like-minded professionals from around the world. The ICASSP 2017 conference will feature presentations by internationally renowned speakers and cutting-edge session topics.

Summarization of videos depicting human activities is a timely and increasingly relevant problem with important applications, e.g., in surveillance or film/TV production. Research on video summarization has mainly relied on global clustering or local (frame-by-frame) saliency methods to provide automated algorithmic

This study introduces a machine hearing system for robot audition, which enables a robotic agent to proactively minimize the uncertainty of sound source location estimates through motion. The proposed system is based on an active exploration approach, which provides a means to model and predict, in a probabilistic manner, the effects of the agent's future motions on localization uncertainty. Particle filtering is used to estimate the posterior probability density function of the source position from binaural measurements, which makes it possible to assess the azimuth and distance of the source jointly.
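
As a rough illustration of the particle-filtering idea in this abstract, the sketch below estimates a static 2-D source position from noisy azimuth-only measurements taken while the agent moves. It is not the authors' system: the fixed agent path, the Gaussian bearing likelihood, the noise levels, and all names (`estimate_source`, `simulate_measurements`, `AGENT_PATH`, `TRUE_SOURCE`) are assumptions made for this example, and no active motion planning is included.

```python
import math
import random

# Hypothetical scenario: a static source and a fixed agent path (assumptions).
TRUE_SOURCE = (2.0, 1.0)
AGENT_PATH = [(0.0, 0.0), (1.0, -1.0), (2.0, -1.5),
              (3.0, -1.0), (4.0, 0.0), (4.0, 1.5)]

def wrap(angle):
    """Wrap an angle to the interval [-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))

def bearing(agent, point):
    """Azimuth of `point` as seen from `agent`, in radians."""
    return math.atan2(point[1] - agent[1], point[0] - agent[0])

def simulate_measurements(path, source, noise_std=0.05, seed=1):
    """Generate one noisy azimuth measurement per agent position."""
    rng = random.Random(seed)
    return [bearing(a, source) + rng.gauss(0.0, noise_std) for a in path]

def estimate_source(path, measurements, n=2000, sigma=0.1, seed=0):
    """Bootstrap particle filter over the 2-D source position."""
    rng = random.Random(seed)
    # Uniform prior over a 10 m x 10 m area around the origin.
    particles = [(rng.uniform(-5, 5), rng.uniform(-5, 5)) for _ in range(n)]
    for agent, z in zip(path, measurements):
        # Weight each particle by a Gaussian likelihood of the bearing error.
        weights = [
            math.exp(-0.5 * (wrap(z - bearing(agent, p)) / sigma) ** 2) + 1e-300
            for p in particles
        ]
        # Resample in proportion to the weights, then add a small jitter
        # so that the particle set does not collapse onto a few points.
        particles = rng.choices(particles, weights=weights, k=n)
        particles = [(x + rng.gauss(0, 0.05), y + rng.gauss(0, 0.05))
                     for x, y in particles]
    # Posterior mean as a point estimate of the source position.
    return (sum(p[0] for p in particles) / n,
            sum(p[1] for p in particles) / n)

if __name__ == "__main__":
    zs = simulate_measurements(AGENT_PATH, TRUE_SOURCE)
    print(estimate_source(AGENT_PATH, zs))
```

A single bearing constrains the source only to a ray, so distance becomes observable once the agent has measured from several positions; this is the intuition behind reducing uncertainty through motion.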

This paper presents supervised feature learning approaches for speaker identification that rely on nonnegative matrix factorisation. Recent studies have shown that group nonnegative matrix factorisation and task-driven supervised dictionary learning can help perform effective feature learning for audio classification problems.
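
For readers unfamiliar with the underlying factorisation, the following minimal sketch shows plain nonnegative matrix factorisation with the standard multiplicative updates for the squared (Frobenius) error. It does not implement the group or task-driven supervised variants mentioned in the abstract; the function names, the toy matrix, and the iteration count are assumptions chosen for illustration.

```python
import random

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def frob_error(V, W, H):
    """Squared Frobenius norm of V - W H."""
    WH = matmul(W, H)
    return sum((v - wh) ** 2
               for row_v, row_wh in zip(V, WH)
               for v, wh in zip(row_v, row_wh))

def nmf(V, rank, iters=200, seed=0, eps=1e-9):
    """Factorise a nonnegative matrix V (n x m) as W (n x rank) times H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    # Strictly positive random initialisation.
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise.
        Wt = transpose(W)
        num = matmul(Wt, V)
        den = matmul(matmul(Wt, W), H)
        H = [[h * a / (b + eps) for h, a, b in zip(rh, ra, rb)]
             for rh, ra, rb in zip(H, num, den)]
        # W <- W * (V H^T) / (W H H^T), element-wise.
        Ht = transpose(H)
        num = matmul(V, Ht)
        den = matmul(W, matmul(H, Ht))
        W = [[w * a / (b + eps) for w, a, b in zip(rw, ra, rb)]
             for rw, ra, rb in zip(W, num, den)]
    return W, H

if __name__ == "__main__":
    # Toy nonnegative "spectrogram" (assumption): rows are features, columns are frames.
    V = [[1, 2, 3], [2, 4, 6], [1, 0, 1], [3, 2, 5]]
    W, H = nmf(V, rank=2)
    print(frob_error(V, W, H))
```

In an audio setting, the columns of `W` would play the role of a learned dictionary and the columns of `H` the per-frame activations that can serve as features for a classifier.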

High Resolution Envelope Processing (HREP) is a new tool for improved perceptual coding of audio signals that predominantly consist of many dense transient events, such as applause or raindrop sounds. These signals have traditionally been very difficult for perceptual audio codecs to code, particularly at low bit rates. Based on the gain control principle, HREP acts as a pre-/post-processor pair to perceptual audio codecs and preserves the temporal fine structure and subjective quality of applause-like signals.

This paper proposes a reliable 3D fish tracking method using a novel master-slave camera setup. Instead of conventional dynamic models that rely on prior knowledge about target kinematics, the proposed method learns the kinematic model with a Long Short-Term Memory (LSTM) network. On this basis, the 3D state of the fish at each moment is predicted by the LSTM network. We propose an innovative master-view-tracking-first strategy: the fish are first tracked in the master view, and cross-view association is then established using motion continuity and epipolar constraint cues.

Directional block transforms (DBTs), such as discrete Fourier transforms, are generally less efficient for sparse image representation than directional overlapped transforms, such as curvelets and contourlets, but have practical computational advantages, such as lower computational cost, lower memory usage, and suitability for parallel processing. In order to realize efficient DBTs, this paper proposes
