- Read more about A Pool of Deep Models for Event Recognition
This paper proposes a novel two-stage framework for event recognition in still images. First, for a generic event image, deep features obtained from different pre-trained models are fed into an ensemble of classifiers. The posterior classification probabilities of these classifiers are then fused by an order-induced scheme, which penalizes the scores according to their confidence in classifying the image at hand and then averages them. Second, we combine the fusion results with a reverse matching paradigm to produce the final output of the proposed pipeline.
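The fusion step described above can be illustrated with a minimal sketch. The abstract does not give the penalty weights or the confidence measure, so everything here is an assumption made for illustration: confidence is taken as the largest posterior of each classifier, and the hypothetical penalties decay linearly with the confidence rank. The reverse matching stage is not covered.

```python
def order_induced_fusion(posteriors, penalties=None):
    """Fuse per-classifier posterior vectors (illustrative sketch only).

    Assumptions, not taken from the paper:
    - a classifier's confidence is its largest posterior probability;
    - classifiers are ordered by confidence, most confident first;
    - the default penalties decay linearly with rank.
    """
    # Order the classifiers by confidence (this is the "order-inducing" variable).
    ranked = sorted(posteriors, key=max, reverse=True)
    n = len(ranked)
    if penalties is None:
        # Hypothetical penalty weights: 1, (n-1)/n, ..., 1/n.
        penalties = [1.0 - rank / n for rank in range(n)]
    n_classes = len(ranked[0])
    # Penalize each score vector by its rank weight, then average over classifiers.
    return [
        sum(w * p[c] for w, p in zip(penalties, ranked)) / n
        for c in range(n_classes)
    ]


# Three classifiers, two event classes.
posteriors = [[0.6, 0.4], [0.1, 0.9], [0.5, 0.5]]
fused = order_induced_fusion(posteriors)
predicted_class = fused.index(max(fused))
```

With these inputs the most confident classifier (the second one) dominates the fused scores, so the less certain classifiers contribute less to the final decision.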
- Read more about Learning a Cross-Modal Hashing Network for Multimedia Search
In this paper, we propose a cross-modal hashing network (CMHN) method to learn compact binary codes for cross-modality multimedia search. Unlike most existing cross-modal hashing methods, which learn a single pair of projections to map each example into a binary vector, we design a deep neural network to learn multiple pairs of hierarchical nonlinear transformations, under which the nonlinear characteristics of samples can be well exploited and the modality gap is reduced.
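To make the role of the binary codes concrete, here is a minimal sketch of retrieval by Hamming distance. It does not implement CMHN: the embeddings are assumed to come from some already trained pair of modality-specific networks, and the sign-based `binarize` function and the example vectors are hypothetical.

```python
def binarize(embedding):
    """Turn a real-valued embedding into a binary code (assumed sign thresholding)."""
    return [1 if value >= 0 else 0 for value in embedding]


def hamming(code_a, code_b):
    """Count the positions where two equal-length binary codes differ."""
    return sum(a != b for a, b in zip(code_a, code_b))


def search(query_code, database_codes, k=1):
    """Return the indices of the k database codes closest to the query."""
    order = sorted(
        range(len(database_codes)),
        key=lambda i: hamming(query_code, database_codes[i]),
    )
    return order[:k]


# Hypothetical embeddings: one text query and three images in a shared space.
text_embedding = [0.7, -0.2, 0.1, -0.9]
image_embeddings = [
    [-0.5, 0.3, -0.1, 0.8],
    [0.6, -0.4, 0.2, -0.7],
    [0.1, 0.2, -0.3, -0.4],
]

query = binarize(text_embedding)
database = [binarize(e) for e in image_embeddings]
top_two = search(query, database, k=2)  # image 1 first, then image 2
```

Because the codes are short bit vectors, the comparison reduces to counting differing bits, which is what makes hashing attractive for large-scale search.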
- Read more about SELF-PACED LEAST SQUARE SEMI-COUPLED DICTIONARY LEARNING FOR PERSON RE-IDENTIFICATION
Person re-identification aims to match people across disjoint camera views. It has been reported that the Least Square Semi-Coupled Dictionary Learning (LSSCDL) based sample-specific SVM learning framework achieves state-of-the-art performance. However, the objective function of LSSCDL, the algorithm that learns the pair of (feature, weight) dictionaries and the mapping function between the feature space and the weight space, is non-convex, which usually results in
- Read more about LEVEL-SET FORMULATION BASED ON OTSU METHOD WITH MORPHOLOGICAL REGULARIZATION
Noisy image segmentation is one of the most important and challenging problems in computer vision. In this paper, we propose a level set segmentation technique inspired by the classic Otsu thresholding method. The front propagation of the proposed level set based method embeds a cost function that takes into account first-order statistical moments. To deal with highly noisy images, we also add a morphological step to our algorithm, which makes the final segmentation more robust and efficient.
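For readers unfamiliar with the classic Otsu method that inspires this work, the following is a minimal sketch of plain Otsu thresholding on a flat list of integer gray levels. It is not the proposed level set formulation and includes no morphological step; the function name and the sample pixel values are illustrative.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes the between-class variance.

    Pixels with value <= t form one class, the rest form the other.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    weight0 = 0      # number of pixels in the lower class
    sum0 = 0         # sum of gray levels in the lower class
    best_t, best_var = 0, -1.0
    for t in range(levels):
        weight0 += hist[t]
        sum0 += t * hist[t]
        weight1 = total - weight0
        if weight0 == 0:
            continue         # lower class still empty
        if weight1 == 0:
            break            # upper class empty from here on
        mean0 = sum0 / weight0
        mean1 = (sum_all - sum0) / weight1
        # Between-class variance, up to the constant factor 1 / total**2.
        between_var = weight0 * weight1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


# A tiny bimodal "image": dark pixels near 10, bright pixels near 200.
pixels = [8, 10, 12, 9, 11, 200, 198, 202, 199, 201]
t = otsu_threshold(pixels)
mask = [p > t for p in pixels]  # True marks the bright class
```

The criterion only uses class weights and class means, that is, first-order statistics of the histogram, which is the same kind of information the abstract says the level set cost function relies on.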
- Read more about Part Based Fine-grained Bird Image Retrieval Respecting Species Correlation
- Read more about EXTRACTING KEY FRAMES FROM FIRST-PERSON VIDEOS IN THE COMMON SPACE OF MULTIPLE SENSORS
Selecting authentic scenes of activities of daily living (ADL) is useful for supporting our memory of everyday life. Key-frame extraction for first-person vision (FPV) videos is a core technology for realizing such a memory assistant. However, most existing key-frame extraction methods have focused on stable scenes not related to ADL and have used only the visual signals of the image sequence, even though the activities usually associate with our visual experience. To deal with the dynamically changing scenes of FPV videos of daily activities, integrating motion and visual signals is essential.
This work addresses the image recovery problem in the presence of salt-and-pepper noise and image blur. The salt-and-pepper noise, regarded in this paper as impulsive noise, is modeled as a sparse signal because of its impulsiveness. To accurately reconstruct the clean image and the blur kernel, framelet domains are exploited to sparsely represent both the image and the blur kernel. From these reformulations, a joint estimation is devised to simultaneously perform image recovery, salt-and-pepper noise suppression, and blur kernel estimation under a single optimization framework.
- Read more about How should we evaluate supervised hashing?
- Read more about Summarization of Human Activity Videos Via Low-Rank Approximation
Summarization of videos depicting human activities is a timely problem with important applications, e.g., in the domains of surveillance or film/TV production, that steadily becomes more relevant. Research on video summarization has mainly relied on global clustering or local (frame-by-frame) saliency methods to provide automated algorithmic