Sorry, you need to enable JavaScript to visit this website.

A residual-networks family with hundreds or even thousands of layers dominates major image recognition tasks, but building a network by simply stacking residual blocks inevitably limits its optimization ability. This paper proposes a novel residual-network architecture, Residual networks of Residual networks (RoR), to dig the optimization ability of residual networks. RoR substitutes optimizing residual mapping of residual mapping for optimizing original residual mapping.

Categories:
8 Views

In this paper, we introduce an adaptive unsupervised learning framework, which utilizes natural images to train filter sets. The ap- plicability of these filter sets is demonstrated by evaluating their per- formance in two contrasting applications - image quality assessment and texture retrieval. While assessing image quality, the filters need to capture perceptual differences based on dissimilarities between a reference image and its distorted version. In texture retrieval, the filters need to assess similarity between texture images to retrieve closest matching textures.

Categories:
12 Views

In this paper, we introduce an adaptive unsupervised learning framework, which utilizes natural images to train filter sets. The ap- plicability of these filter sets is demonstrated by evaluating their per- formance in two contrasting applications - image quality assessment and texture retrieval. While assessing image quality, the filters need to capture perceptual differences based on dissimilarities between a reference image and its distorted version. In texture retrieval, the filters need to assess similarity between texture images to retrieve closest matching textures.

Categories:
5 Views

In this paper, we propose a cross-modal hashing network (CMHN) method to learn compact binary codes for cross-modality multimedia search. Unlike most existing cross-modal hashing methods which learn a single pair of projections to map each example into a binary vector, we design a deep neural network to learn multiple pairs of hierarchical non-linear transformations, under which the nonlinear characteristics of samples can be well exploited and the modality gap is well reduced.

Categories:
59 Views

In this paper, we explore the redundancy in convolutional neural network, which scales with the complexity of vision tasks. Considering that many front-end visual systems are interested in only a limited range of visual targets, the removing of task-specified network redundancy can promote a wide range of potential applications. We propose a task-specified knowledge distillation algorithm to derive a simplified model with pre-set computation cost and minimized accuracy loss, which suits the resource constraint front-end systems well.

Categories:
14 Views

Automatic emotion recognition from speech is a challenging task which relies heavily on the effectiveness of the speech features used for classification. In this work, we study the use of deep learning to automatically discover emotionally relevant features from speech. It is shown that using a deep recurrent neural network, we can learn both the short-time frame-level acoustic features that are emotionally relevant, as well as an appropriate temporal aggregation of those features into a compact utterance-level representation.

Categories:
185 Views

Recurrent neural network (RNN) based character-level language models (CLMs) are extremely useful for modeling out-of-vocabulary words by nature. However, their performance is generally much worse than the word-level language models (WLMs), since CLMs need to consider longer history of tokens to properly predict the next one. We address this problem by proposing hierarchical RNN architectures, which consist of multiple modules with different timescales.

Categories:
18 Views

Pages