Sorry, you need to enable JavaScript to visit this website.

This paper presents an efficient convolutional neural net- work (CNN)-based multiple path search (MPS) algorithm to detect multiple spatial-temporal action tubes in videos. With the pass information and the accumulated scores generated by forward message passing, the new algorithm reuses these information to simultaneously find multiple paths in back- ward path tracing without repeating the search process. More- over, to rectify the potentially inaccurate bounding boxes, we also propose a video localization refinement scheme to further boost the detection accuracy.

Categories:
14 Views

In recent years, text recognition has achieved remarkable success in recognizing scanned document text. However, word recognition in natural images is still an open problem, which generally requires time consuming post-processing steps. We present a novel architecture for individual word detection in scene images based on semantic segmentation. Our contributions are twofold: the concept of WordFence, which detects border areas surrounding each individual word and a novel pixelwise weighted softmax loss function which penalizes background and emphasizes small text regions.

Categories:
9 Views

High-dynamic-range imaging (HDRI) techniques are proposed to extend the dynamic range of captured images against
sensor limitation. The key issue of multi-exposure fusion in HDRI is removing ghost artifacts caused by the motion of moving objects and handheld cameras. This paper proposes a ghost-free HDRI algorithm based on visual salience and
stack extension. To improve the accuracy of ghost areas detection, visual salience based bilateral motion detection is

Categories:
77 Views

Classical approaches for estimating optical flow have achieved rapid progress in the last decade. However, most of them are too slow to be applied in real-time video analysis. Due to the great success of deep learning, recent work has focused on using CNNs to solve such dense prediction problems. In this paper, we investigate a new deep architecture, Densely Connected Convolutional Networks (DenseNet), to learn optical flow. This specific architecture is ideal for the problem at hand as it provides shortcut connections throughout the network, which leads to implicit deep supervision.

Categories:
15 Views

We target at solving the problem of automatic image segmentation. Although 1D contour and 2D surface cues have been widely utilized in existing work, 3D depth information of an image, a necessary cue according to human visual perception, is however overlooked in automatic image segmentation. In this paper, we study how to fully utilize 1D contour, 2D surface, and 3D depth cues for image segmentation. First, three elementary segmentation modules are developed for these cues respectively.

Categories:
13 Views

We target at solving the problem of automatic image segmentation. Although 1D contour and 2D surface cues have been widely utilized in existing work, 3D depth information of an image, a necessary cue according to human visual perception, is however overlooked in automatic image segmentation. In this paper, we study how to fully utilize 1D contour, 2D surface, and 3D depth cues for image segmentation. First, three elementary segmentation modules are developed for these cues respectively.

Categories:
32 Views

Pages