In this paper, we propose a new approach for searching for action proposals in unconstrained videos. Our method first produces snippet action proposals by combining the state-of-the-art YOLO detector (Static YOLO) with our regression-based RNN detector (Recurrent YOLO). These short action proposals are then integrated into final action proposals by solving a two-pass dynamic programming problem that jointly maximizes the actionness score and temporal smoothness.
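
The linking step described above can be illustrated with a small sketch. This is not the authors' two-pass formulation, whose details the abstract does not give; it is a single-pass, Viterbi-style dynamic program under assumed conventions: each time step holds candidate boxes with an actionness score, smoothness is measured by the IoU of consecutive boxes, and the names `iou`, `link_proposals`, and `smooth_weight` are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed layout


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def link_proposals(
    frames: List[List[Tuple[Box, float]]],
    smooth_weight: float = 1.0,  # hypothetical trade-off parameter
) -> Tuple[List[int], float]:
    """Pick one candidate per time step, maximizing the sum of
    actionness scores plus smooth_weight * IoU between neighbours."""
    if not frames or any(not f for f in frames):
        raise ValueError("every time step needs at least one candidate")

    # best[i] = best total score of any path ending at candidate i
    best = [score for _, score in frames[0]]
    back: List[List[int]] = []  # back-pointers for each later time step

    for t in range(1, len(frames)):
        cur_best, cur_back = [], []
        for box, score in frames[t]:
            totals = [
                best[j] + smooth_weight * iou(prev_box, box)
                for j, (prev_box, _) in enumerate(frames[t - 1])
            ]
            j_star = max(range(len(totals)), key=totals.__getitem__)
            cur_best.append(totals[j_star] + score)
            cur_back.append(j_star)
        best = cur_best
        back.append(cur_back)

    # Trace the best path backwards from the best final candidate.
    end = max(range(len(best)), key=best.__getitem__)
    total = best[end]
    path = [end]
    for ptrs in reversed(back):
        path.append(ptrs[path[-1]])
    path.reverse()
    return path, total
```

With `smooth_weight=0` the path simply follows the highest score at each step; raising it favours spatially consistent tubes even when an individual box scores slightly lower.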

A novel deep convolutional neural network, named Foveated Neural Network (FNN), is proposed to predict gaze on the current frame in egocentric videos. The retina-like visual inputs from the region of interest on the previous frame are analysed and encoded. The fusion of the hidden representation of the previous frame and the feature maps of the current frame guides gaze prediction on the current frame. To simulate motion, we also include the dense optical flow between these adjacent frames as an additional input to FNN.

Despite recent attempts at solving the person re-identification problem, it remains a challenging task, since a person’s appearance can vary significantly under large variations in view angle, human pose and illumination. The concept of attention is one of the most interesting recent architectural innovations in neural networks. Inspired by it, in this paper we propose a novel approach that uses a gradient-based attention mechanism in a deep convolutional neural network to solve the person re-identification problem.

Fully connected multi-layer neural networks such as Deep Boltzmann Machines (DBM) perform better than fully connected single-layer neural networks in image classification tasks and have fewer hidden-layer neurons than Extreme Learning Machine (ELM) based fully connected multi-layer neural networks such as Multi Layer ELM (ML-ELM) and Hierarchical ELM (H-ELM). However, ML-ELM and H-ELM have a shorter training time than DBM.

The family of residual networks, with hundreds or even thousands of layers, dominates major image recognition tasks, but building a network by simply stacking residual blocks inevitably limits its optimization ability. This paper proposes a novel residual-network architecture, Residual networks of Residual networks (RoR), to exploit the optimization ability of residual networks. RoR substitutes optimizing the residual mapping of a residual mapping for optimizing the original residual mapping.
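
The phrase "residual mapping of residual mapping" can be illustrated with a toy sketch. The abstract does not specify where the extra shortcuts go, so the single outer shortcut around a group of blocks is an assumption here, and `residual_block` and `ror_group` are hypothetical names operating on plain lists of floats rather than real feature maps.

```python
from typing import Callable, List, Sequence

Vector = List[float]
Mapping = Callable[[Vector], Vector]


def residual_block(x: Vector, f: Mapping) -> Vector:
    """Ordinary residual block: output = x + f(x)."""
    return [xi + fi for xi, fi in zip(x, f(x))]


def ror_group(x: Vector, mappings: Sequence[Mapping]) -> Vector:
    """Stack of residual blocks wrapped in one extra (outer) shortcut.

    The stack as a whole then only has to learn a residual with respect
    to the group input, i.e. a residual mapping of residual mappings.
    The placement of the outer shortcut is an assumption of this sketch.
    """
    h = x
    for f in mappings:
        h = residual_block(h, f)
    # Outer shortcut: add the group input to the stack output.
    return [xi + hi for xi, hi in zip(x, h)]


def half(v: Vector) -> Vector:
    """Toy stand-in for a learned mapping."""
    return [0.5 * vi for vi in v]


result = ror_group([1.0, 2.0], [half, half])
```

Plain stacking would return only the output of the inner loop; the outer addition is what distinguishes the nested form in this sketch.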

In this paper, we introduce an adaptive unsupervised learning framework, which utilizes natural images to train filter sets. The applicability of these filter sets is demonstrated by evaluating their performance in two contrasting applications - image quality assessment and texture retrieval. While assessing image quality, the filters need to capture perceptual differences based on dissimilarities between a reference image and its distorted version. In texture retrieval, the filters need to assess similarity between texture images to retrieve closest matching textures.
