Binary hashing is a practical approach for fast, approximate retrieval in large image databases. The goal is to learn a hash function that maps images to binary codes such that Hamming distances approximate semantic similarities. Search is then fast because Hamming distances can be computed with hardware-supported binary operations. Most hashing papers define a complicated objective function that couples the individual single-bit hash functions.
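
To illustrate why binary codes make search fast, here is a minimal NumPy sketch (the names and sizes are illustrative, not from any particular paper): Hamming distance between packed codes reduces to an XOR followed by a popcount, both of which map directly onto hardware bit operations.

```python
import numpy as np

def hamming_distances(query, codes):
    """Hamming distances between one packed code and a database of packed codes.

    `query` has shape (n_bytes,) and `codes` shape (n_items, n_bytes), both
    uint8, i.e. 8 hash bits packed per byte (e.g. via np.packbits).
    """
    xor = np.bitwise_xor(codes, query)  # differing bits between query and each item
    # Table-driven popcount over all byte values 0..255.
    popcount = np.unpackbits(np.arange(256, dtype=np.uint8)[:, None], axis=1).sum(1)
    return popcount[xor].sum(axis=1)

# Example: 64-bit codes for a database of 1000 items.
rng = np.random.default_rng(0)
db = rng.integers(0, 256, size=(1000, 8), dtype=np.uint8)
q = rng.integers(0, 256, size=8, dtype=np.uint8)
top10 = np.argsort(hamming_distances(q, db))[:10]  # nearest candidates
```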

We introduce a novel approach to improve unsupervised hashing. Specifically, we propose a very efficient embedding method: Gaussian Mixture Model embedding (Gemb). Using a Gaussian Mixture Model, the proposed method embeds each feature vector into a low-dimensional vector and simultaneously enhances the discriminative properties of the features before they are passed to the hashing stage. Our experiments show that the proposed method boosts the hashing performance of many state-of-the-art methods, e.g.
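
The listing above does not spell out Gemb's exact formulation, so the following is only a hedged sketch of one plausible reading, using scikit-learn: fit a Gaussian Mixture Model on training features and use the posterior responsibilities as a low-dimensional embedding. The function name `gemb_embed` and all parameter choices are hypothetical.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def gemb_embed(train_feats, feats, n_components=32):
    """Embed features as GMM posterior responsibilities (one plausible reading
    of a GMM embedding; the paper's exact Gemb formulation may differ)."""
    gmm = GaussianMixture(n_components=n_components, covariance_type="diag",
                          random_state=0).fit(train_feats)
    # predict_proba yields an (n_samples, n_components) soft assignment,
    # i.e. a low-dimensional, cluster-aware representation of each feature.
    return gmm.predict_proba(feats)

rng = np.random.default_rng(0)
X_train = rng.normal(size=(5000, 512))  # e.g. 512-D deep descriptors
X_query = rng.normal(size=(100, 512))
Z = gemb_embed(X_train, X_query)        # shape (100, 32), fed to any hashing method
```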

Matrix-factorization-based hashing has been very effective for the cross-modal retrieval task. In this work, we propose a novel supervised hashing approach built on matrix factorization that seamlessly incorporates label information. In the proposed approach, latent factors are generated for each individual modality and then mapped into the more discriminative label space using modality-specific linear transformations.
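
As a hedged illustration of the general recipe (not the paper's actual algorithm or notation), the sketch below factorizes one modality into latent factors via truncated SVD and learns a modality-specific linear map into the label space by least squares.

```python
import numpy as np

def modality_to_label_space(X, Y, k=16):
    """X: (n_samples, n_features) for one modality; Y: (n_samples, n_labels)
    one-hot labels. All names here are illustrative."""
    # Latent factors via truncated SVD (one common matrix-factorization choice).
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    V = U[:, :k] * s[:k]                        # (n_samples, k) latent factors
    # Modality-specific linear transform into the label space.
    W, *_ = np.linalg.lstsq(V, Y, rcond=None)   # (k, n_labels)
    return V @ W                                # label-space representation

rng = np.random.default_rng(0)
X_img = rng.normal(size=(200, 100))             # image-modality features
Y = np.eye(5)[rng.integers(0, 5, size=200)]     # 5-class one-hot labels
P = modality_to_label_space(X_img, Y)           # discriminative codes before binarization
```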

Video summarization has become more prominent during the last decade, owing to the massive amount of available digital video content. A video summarization algorithm is typically fed an input video and expected to extract a set of important key-frames that represent the entire content, convey its semantic meaning, and are significantly more concise than the original input. The most widespread approach relies on clustering the video frames and extracting the frames closest to the cluster centroids as key-frames.
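
A minimal sketch of that clustering baseline, assuming one descriptor per frame is already available (the descriptor choice and all names are illustrative):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import pairwise_distances_argmin_min

def keyframes_by_clustering(frame_feats, n_keyframes=5):
    """Cluster frame descriptors, then keep the frame nearest each centroid."""
    km = KMeans(n_clusters=n_keyframes, n_init=10, random_state=0).fit(frame_feats)
    # Index of the frame closest to each cluster centroid.
    idx, _ = pairwise_distances_argmin_min(km.cluster_centers_, frame_feats)
    return np.sort(idx)  # key-frame indices in temporal order

rng = np.random.default_rng(0)
feats = rng.normal(size=(3000, 128))     # e.g. 3000 frames, 128-D descriptors
print(keyframes_by_clustering(feats))    # five representative frame indices
```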

Video summarization is a timely and rapidly developing research field with broad commercial interest, due to the increasing availability of massive video data. Relevant algorithms face the challenge of achieving a careful balance between summary compactness, enjoyability, and content coverage. The specific case of stereoscopic 3D theatrical films has become more important in recent years but has not received corresponding research attention.

Social media data of multiple types carry abundant information, but learning from multi-modal social data is challenging due to data heterogeneity and the noise in user-generated content. To address this problem, we propose a multi-view network-based clustering approach that is robust to noise and fully reflects the underlying structure of the comprehensive network. To demonstrate the proposed approach, we experiment with clustering challenging tagged images of landmarks.
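
As a rough illustration only (not the paper's algorithm), multi-view network clustering can be sketched by building one affinity matrix per view, fusing them, and applying spectral clustering to the combined network:

```python
import numpy as np
from sklearn.cluster import SpectralClustering

def multiview_cluster(affinities, n_clusters=3):
    """affinities: list of (n, n) symmetric, nonnegative view affinity matrices."""
    combined = np.mean(affinities, axis=0)  # simple uniform fusion of the views
    model = SpectralClustering(n_clusters=n_clusters, affinity="precomputed",
                               random_state=0)
    return model.fit_predict(combined)

rng = np.random.default_rng(0)
n = 120
# Two synthetic views, e.g. visual similarity and tag similarity.
visual = np.abs(rng.normal(size=(n, n))); visual = (visual + visual.T) / 2
tags = np.abs(rng.normal(size=(n, n)));   tags = (tags + tags.T) / 2
labels = multiview_cluster([visual, tags])
```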

For 3D object detection and pose estimation, it is crucial to extract distinctive, representative features of the objects and describe them efficiently. Consequently, a large number of 3D feature descriptors have been developed. Among these, the Point Feature Histogram RGB (PFHRGB) descriptor has been shown to perform best for 3D object and category recognition. However, this descriptor is vulnerable to variations in point density and accordingly produces many false correspondences.
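
One common mitigation for point-density variation, shown below purely as an illustration and not as this paper's contribution, is voxel-grid downsampling, which keeps roughly one point per occupied voxel so that descriptors are computed over a more uniform density:

```python
import numpy as np

def voxel_downsample(points, voxel_size):
    """points: (n, 3) xyz coordinates; returns one centroid per occupied voxel."""
    keys = np.floor(points / voxel_size).astype(np.int64)
    # Group points by voxel key and average each group.
    _, inverse, counts = np.unique(keys, axis=0,
                                   return_inverse=True, return_counts=True)
    sums = np.zeros((counts.size, 3))
    np.add.at(sums, inverse, points)
    return sums / counts[:, None]

rng = np.random.default_rng(0)
cloud = rng.uniform(0, 1, size=(10000, 3))        # synthetic point cloud
uniform = voxel_downsample(cloud, voxel_size=0.05)
```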

This paper proposes a novel two-stage framework for event recognition in still images. First, for a generic event image, deep features obtained from different pre-trained models are fed into an ensemble of classifiers. The resulting posterior classification probabilities are then fused by an order-induced scheme, which penalizes the scores according to each classifier's confidence in the image at hand and then averages them. Second, we combine the fusion results with a reverse-matching paradigm to produce the final output of the proposed pipeline.
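
The listing does not give the exact order-induced scheme, so the following is a hedged sketch of the general idea: rank each classifier's posterior vector by its confidence, penalize scores down the ranking, and average. The geometric decay used here is an assumption, not the paper's penalty.

```python
import numpy as np

def order_induced_fusion(posteriors, decay=0.7):
    """posteriors: (n_classifiers, n_classes) for a single image.
    The geometric rank penalty is illustrative only."""
    confidence = posteriors.max(axis=1)       # per-classifier confidence
    order = np.argsort(-confidence)           # most confident first
    weights = decay ** np.arange(len(order))  # penalty grows down the ranking
    weights /= weights.sum()
    # Weighted average of the confidence-ordered posterior vectors.
    return (weights[:, None] * posteriors[order]).sum(axis=0)

ensemble = np.array([[0.6, 0.3, 0.1],         # three classifiers, three classes
                     [0.4, 0.4, 0.2],
                     [0.9, 0.05, 0.05]])
print(order_induced_fusion(ensemble))         # fused class posterior
```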
