Learning binary representation is essential to large-scale computer vision tasks. Most existing algorithms require a separate quantization constraint to learn effective hashing functions. In this work, we present Direct Binary Embedding (DBE), a simple yet very effective algorithm to learn binary representation in an end-to-end fashion. By appending an ingeniously designed DBE layer to the deep convolutional neural network (DCNN), DBE learns binary code directly from the continuous DBE layer activation without quantization error.

Binary hashing is an established approach for fast, approximate image search. It maps a query image to a binary vector so that Hamming distances approximate image similarities. Applying the hash function can be made fast by using a circulant matrix and the fast Fourier transform, but this circulant hash function must be learned optimally from training data. We show that a previously proposed learning algorithm based on optimization in the frequency domain is suboptimal.
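As a rough illustration of why a circulant hash function is fast to apply, the sketch below computes the circulant projection with the FFT and then thresholds it into a binary code. The function names and the random vector `r` are assumptions made for this example; in particular, `r` stands in for the learned parameters that the abstract says must be optimized from training data, and no learning is shown here.

```python
import numpy as np

def circulant_project(x, r):
    """Multiply x by the circulant matrix whose first column is r.

    A circulant matrix-vector product is a circular convolution, so it
    can be computed in O(d log d) with the FFT instead of O(d^2).
    """
    return np.fft.ifft(np.fft.fft(r) * np.fft.fft(x)).real

def circulant_hash(x, r):
    """Binarize the circulant projection into a vector of 0/1 bits."""
    return (circulant_project(x, r) >= 0).astype(np.uint8)

def hamming(a, b):
    """Number of bit positions in which two binary codes differ."""
    return int(np.count_nonzero(a != b))

# Hypothetical data: r is random here, not learned.
rng = np.random.default_rng(0)
d = 16
r = rng.standard_normal(d)
query = rng.standard_normal(d)
near = query + 0.01 * rng.standard_normal(d)  # slightly perturbed copy
far = rng.standard_normal(d)                  # unrelated vector

code_q = circulant_hash(query, r)
print(hamming(code_q, circulant_hash(near, r)))
print(hamming(code_q, circulant_hash(far, r)))
```

With a well-chosen `r`, similar inputs should tend to receive codes with a small Hamming distance, which is what makes the codes usable for approximate search.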

Compared with unsupervised hashing, supervised hashing commonly achieves better accuracy in many real applications by leveraging semantic (label) information. However, it is tough to solve the supervised hashing problem directly because it is essentially a discrete optimization problem. Some other works try to solve the discrete optimization problem directly using binary quadratic programming, but they are typically too complicated and time-consuming, while some supervised hashing methods have to solve a relaxed continuous optimization problem by dropping the discrete constraints.

This paper develops a novel object-based graph model for semantic video comparison. The model describes a video with detected objects as nodes and the relationships between the objects as edges in a graph. We investigated several spatial and temporal features as the graph node attributes, and different ways to describe the spatial-temporal relationship between objects as the edge attributes. To tackle the problem of erratic camera motion affecting the detected objects, a global motion estimation and correction approach is proposed to reveal the true object trajectory.

In this paper, we aim to find exactly the same shoes given a daily shoe photo (street scenario) that matches the online shop shoe photo (shop scenario). There are large visual differences between the street and shop scenario shoe images. To handle the discrepancy of different scenarios, we learn a feature embedding for shoes via a viewpoint-invariant triplet network, the feature activations of which reflect the inherent similarity between any two shoe images.
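To make the idea of a triplet-based embedding concrete, here is a minimal sketch of a standard margin-based triplet loss on already-computed embedding vectors. This is the generic formulation, assumed for illustration only; the abstract does not specify the loss or the viewpoint-invariant architecture, and the function name, margin value, and toy vectors are hypothetical.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Generic margin-based triplet loss on embedding vectors.

    The loss is zero once the anchor is closer to the positive than to
    the negative by at least `margin` (in squared Euclidean distance).
    """
    d_pos = np.sum((anchor - positive) ** 2)
    d_neg = np.sum((anchor - negative) ** 2)
    return float(max(0.0, d_pos - d_neg + margin))

# Hypothetical embeddings: a street photo (anchor), the matching shop
# photo (positive), and a different shoe (negative).
street = np.array([0.0, 0.0])
same_shoe = np.array([0.0, 1.0])
other_shoe = np.array([3.0, 4.0])

print(triplet_loss(street, same_shoe, other_shoe))  # triplet already satisfied
print(triplet_loss(street, other_shoe, same_shoe))  # triplet violated
```

Training on such a loss pulls street and shop images of the same shoe together in the embedding space while pushing different shoes apart.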
