
Accurate segmentation of humans from live video is an important problem in developing immersive video experiences. We propose to extract human segmentation information from color and depth cues in a video using multiple modeling techniques. Prior information from human skeleton data is also fused with the depth and color models to obtain the final segmentation inside a graph-cut framework. The proposed method runs in real time on live videos using a single CPU and is shown to quantitatively outperform methods that directly fuse color and depth data.
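As a rough illustration of this kind of fusion (not the authors' implementation), the sketch below combines hypothetical per-pixel foreground probabilities from color, depth, and skeleton models into the unary terms of a binary graph cut, using the PyMaxflow library; the fusion weights and the smoothness cost are placeholder values.

# Minimal sketch, assuming per-pixel foreground-probability maps from color,
# depth, and skeleton models are already available; weights are illustrative.
import numpy as np
import maxflow  # PyMaxflow: generic s-t min-cut solver

def fused_graph_cut(p_color, p_depth, p_skel, w=(0.4, 0.4, 0.2), smooth=2.0):
    """p_* are HxW foreground-probability maps with values in (0, 1)."""
    p_fg = np.clip(w[0]*p_color + w[1]*p_depth + w[2]*p_skel, 1e-6, 1 - 1e-6)
    g = maxflow.Graph[float]()
    nodes = g.add_grid_nodes(p_fg.shape)
    g.add_grid_edges(nodes, smooth)                   # 4-connected smoothness term
    g.add_grid_tedges(nodes, -np.log(p_fg),           # cost of labelling foreground
                             -np.log(1.0 - p_fg))     # cost of labelling background
    g.maxflow()
    return g.get_grid_segments(nodes)                 # True = foreground pixels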


Group sparsity, or nonlocal image representation, has shown great potential in image denoising. However, most existing methods only consider the nonlocal self-similarity (NSS) prior of the noisy input image, that is, similar patches are collected only from the degraded input, which makes the quality of image denoising depend largely on the input itself. In this paper, we propose a new prior model for image denoising, called the group sparsity residual constraint (GSRC).
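To make the grouping idea concrete, here is a minimal sketch of the nonlocal step: similar patches are collected around a reference location and shrunk jointly via singular-value soft-thresholding. The patch size, search window, and threshold are illustrative, and the residual constraint against a cleaner reference estimate, which is the contribution of GSRC, is not reproduced here.

# Illustrative sketch of nonlocal patch grouping with a generic sparse shrinkage;
# the actual GSRC residual constraint and parameters differ.
import numpy as np

def denoise_group(noisy, y0, x0, patch=8, window=20, n_sim=32, tau=0.5):
    """Denoise one patch group around (y0, x0). `noisy` is a 2-D float image."""
    ref = noisy[y0:y0+patch, x0:x0+patch]
    # Nonlocal self-similarity: gather the most similar patches in a local window.
    cands = []
    for y in range(max(0, y0-window), min(noisy.shape[0]-patch, y0+window)):
        for x in range(max(0, x0-window), min(noisy.shape[1]-patch, x0+window)):
            p = noisy[y:y+patch, x:x+patch]
            cands.append((np.sum((p - ref)**2), p.ravel()))
    cands.sort(key=lambda c: c[0])
    group = np.stack([c[1] for c in cands[:n_sim]], axis=1)   # patch^2 x n_sim
    # Group-sparse shrinkage: soft-threshold the singular values of the group.
    u, s, vt = np.linalg.svd(group, full_matrices=False)
    s = np.maximum(s - tau, 0.0)
    return (u * s) @ vt    # denoised patch group (columns are patches)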


The movement of the tongue plays an important role in pronunciation. Visualizing tongue movement can improve speech intelligibility and also help in learning a second language. However, hardly any research has investigated this topic. In this paper, a framework to synthesize continuous ultrasound tongue movement video from speech is presented. Two different mapping methods are introduced as the most important parts of the framework.
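Purely as an illustration of what a frame-level mapping could look like (the paper's two mapping methods are not reproduced here), the sketch below fits a ridge regression from acoustic feature frames to PCA coefficients of ultrasound frames; the feature choices, dimensions, and regularization are assumptions.

# Toy sketch of a frame-level speech-to-ultrasound mapping under assumed features.
import numpy as np

def fit_linear_map(acoustic, frames, n_pca=32, reg=1e-3):
    """acoustic: T x D MFCC-like features; frames: T x (H*W) flattened images."""
    mean = frames.mean(axis=0)
    # PCA of the ultrasound frames keeps the regression target low-dimensional.
    u, s, vt = np.linalg.svd(frames - mean, full_matrices=False)
    basis = vt[:n_pca]                               # n_pca x (H*W)
    targets = (frames - mean) @ basis.T              # T x n_pca
    # Ridge regression from acoustic frames to PCA coefficients.
    A = np.hstack([acoustic, np.ones((len(acoustic), 1))])
    W = np.linalg.solve(A.T @ A + reg * np.eye(A.shape[1]), A.T @ targets)
    return W, basis, mean

def synthesize(acoustic, W, basis, mean):
    A = np.hstack([acoustic, np.ones((len(acoustic), 1))])
    return (A @ W) @ basis + mean                    # T x (H*W) predicted frames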


Image inpainting consists of filling in missing regions of an image by inferring from the surrounding content.
In the case of texture images, inpainting can be formulated as conditional simulation of a stochastic texture model.
Many texture synthesis methods have thus been adapted to texture inpainting, but these methods do not offer theoretical guarantees since the conditional sampling is in general only approximate.
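For a Gaussian texture model at toy scale, exact conditional simulation reduces to the standard conditional-Gaussian (kriging) formulas, as in the sketch below; the mean and covariance are assumed to be estimated elsewhere, and practical methods work at image scale with more efficient machinery.

# Tiny-scale sketch of exact conditional simulation for a Gaussian texture model:
# sample missing pixels given observed ones. The covariance is a placeholder.
import numpy as np

def conditional_sample(mu, cov, values, known_idx, missing_idx, rng=None):
    """mu: mean vector; cov: full covariance; values: observed pixel values."""
    rng = np.random.default_rng(rng)
    c_mm = cov[np.ix_(missing_idx, missing_idx)]
    c_mk = cov[np.ix_(missing_idx, known_idx)]
    c_kk = cov[np.ix_(known_idx, known_idx)]
    gain = c_mk @ np.linalg.inv(c_kk)                # kriging weights
    cond_mean = mu[missing_idx] + gain @ (values - mu[known_idx])
    cond_cov = c_mm - gain @ c_mk.T
    return rng.multivariate_normal(cond_mean, cond_cov)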


Previous work on actor identification has mainly focused on static features based on face identification and costume detection, without considering the abundant dynamic information contained in videos. In this paper, we propose a novel method to mine representative actions of each actor, and show the remarkable power of such actions for the actor identification task. Videos are first divided into shots and represented by bag-of-words (BoW) descriptors based on spatio-temporal features (a sketch of this step follows below). Then we integrate the prototype

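The sketch below illustrates only the shot-level BoW step under assumed inputs: spatio-temporal descriptors are quantized with a k-means codebook and aggregated into per-shot histograms. Descriptor extraction and the prototype-based mining are not shown.

# Sketch of the shot-level BoW representation; descriptors are assumed given.
import numpy as np
from sklearn.cluster import KMeans

def build_bow(shot_descriptors, n_words=1000, seed=0):
    """shot_descriptors: list of (n_i x D) arrays, one per shot."""
    codebook = KMeans(n_clusters=n_words, random_state=seed, n_init=10)
    codebook.fit(np.vstack(shot_descriptors))
    bows = []
    for desc in shot_descriptors:
        words = codebook.predict(desc)
        hist = np.bincount(words, minlength=n_words).astype(float)
        bows.append(hist / max(hist.sum(), 1.0))     # L1-normalized BoW per shot
    return np.array(bows), codebook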

A latent style model describing manga styles, based on the proposed manga-specific features, is constructed to facilitate novel style-based applications. Two manga-specific features, i.e., screentone features capturing texture and shading, and panel features capturing panel arrangement, are first proposed to describe manga pages. Based on the latent Dirichlet allocation (LDA) technique, we discover latent style elements embedded in manga documents, which are described by visual words derived from the manga-specific features.
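As a generic illustration of the topic-modeling step (not the paper's pipeline), the sketch below runs scikit-learn's LDA over an assumed page-by-visual-word count matrix to obtain per-page style mixtures and per-style word weights; the screentone and panel features that produce the visual words are not reproduced here.

# Generic LDA step over an assumed page-by-visual-word count matrix.
from sklearn.decomposition import LatentDirichletAllocation

def discover_styles(word_counts, n_styles=10, seed=0):
    """word_counts: n_pages x n_visual_words matrix of visual-word counts."""
    lda = LatentDirichletAllocation(n_components=n_styles, random_state=seed)
    page_styles = lda.fit_transform(word_counts)     # per-page style mixtures
    style_words = lda.components_                    # per-style word weights
    return page_styles, style_words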

