We present a crowdsourcing (CS) study that examines how specific attributes probabilistically affect the selection and sequencing of images from personal photo collections. Thirteen image attributes are explored, including seven people-centric properties. We first propose a novel dataset shaping technique based on Mixed Integer Linear Programming (MILP) to identify a subset of photos in which the attributes of interest are uniformly distributed and minimally correlated.

We introduce BAFT, a fast, binary, quasi-affine-invariant local image feature. It combines the affine invariance of Harris Affine feature descriptors with the speed of binary descriptors such as BRISK and ORB. BAFT derives its speed and precision from sampling each local image patch in a pattern that depends on the second moment matrix of that patch. This approach yields a fast but discriminative descriptor, especially for image pairs with large perspective changes.
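
As a rough illustration of a sampling pattern that depends on the second moment matrix of a patch, the following minimal sketch sums gradient outer products over a patch and then maps a set of sampling offsets through the inverse square root of that matrix, as in standard affine shape normalization. This is not the BAFT implementation: the function names, the central-difference gradients, the `eps` regularizer, and the unit-determinant scaling are all assumptions made for the example.

```python
import math

def second_moment_matrix(patch):
    """Return (a, b, c) with M = [[a, b], [b, c]], the sum of gradient
    outer products over the interior pixels of a 2-D patch."""
    h, w = len(patch), len(patch[0])
    a = b = c = 0.0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            ix = (patch[y][x + 1] - patch[y][x - 1]) / 2.0  # d/dx
            iy = (patch[y + 1][x] - patch[y - 1][x]) / 2.0  # d/dy
            a += ix * ix
            b += ix * iy
            c += iy * iy
    return a, b, c

def inverse_sqrt(a, b, c, eps=1e-6):
    """Return M^(-1/2) as (p, q, r) for the symmetric matrix [[a, b], [b, c]].
    eps (an assumption of this sketch) keeps M positive definite."""
    a, c = a + eps, c + eps
    s = math.sqrt(max(a * c - b * b, 1e-12))  # sqrt(det M)
    t = math.sqrt(a + c + 2.0 * s)
    # Closed form for 2x2 SPD matrices: sqrt(M) = (M + s*I) / t
    p, q, r = (a + s) / t, b / t, (c + s) / t
    det = p * r - q * q
    return r / det, -q / det, p / det

def shape_pattern(points, m):
    """Map sampling offsets through M^(-1/2), rescaled to unit determinant
    so the pattern changes shape but keeps its area."""
    p, q, r = inverse_sqrt(*m)
    scale = 1.0 / math.sqrt(p * r - q * q)
    return [((p * x + q * y) * scale, (q * x + r * y) * scale)
            for x, y in points]

# Hypothetical 5x5 patch whose intensity varies only along x.
patch = [[x for x in range(5)] for _ in range(5)]
m = second_moment_matrix(patch)
circle = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]
warped = shape_pattern(circle, m)
```

For the example patch, the warped pattern is compressed along x (the direction of strong gradient) and stretched along y, while an isotropic matrix leaves the pattern unchanged.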

Although many visual attention models have been proposed, very few saliency models have investigated the impact of audio information. To develop audio-visual attention models, researchers need ground-truth eye movements recorded while observers explore complex natural scenes under different audio conditions. They also need tools to compare eye movements and gaze patterns across these audio conditions.

The capability of determining the quality of target detections is important for applications using smart cameras, such as autonomous robotics and surveillance. We propose to estimate the quality of target detections by integrating the target location uncertainty over polygonal domains, which represent the fields of view of the cameras. We define a framework based on numerical integration that easily accommodates multiple models for uncertainty and fields of view.
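
The idea of integrating location uncertainty over a polygonal field of view can be sketched as follows. This is a minimal illustration, not the framework described above: it assumes an isotropic 2-D Gaussian as the uncertainty model, a simple polygon as the field of view, and a midpoint-rule grid as the numerical integrator. The names `inside`, `gaussian_pdf`, and `detection_quality` are hypothetical.

```python
import math

def inside(pt, poly):
    """Ray-casting point-in-polygon test for a simple polygon."""
    x, y = pt
    hit = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal line at y
            xc = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < xc:
                hit = not hit
    return hit

def gaussian_pdf(x, y, mean, sigma):
    """Isotropic 2-D Gaussian density (an assumed uncertainty model)."""
    dx, dy = x - mean[0], y - mean[1]
    return math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma)) / (
        2.0 * math.pi * sigma * sigma)

def detection_quality(mean, sigma, poly, step=0.1):
    """Midpoint-rule estimate of the probability mass of the target
    location that falls inside the polygonal field of view."""
    xs = [p[0] for p in poly]
    ys = [p[1] for p in poly]
    xmin, ymin = min(xs), min(ys)
    nx = max(1, round((max(xs) - xmin) / step))
    ny = max(1, round((max(ys) - ymin) / step))
    dx = (max(xs) - xmin) / nx
    dy = (max(ys) - ymin) / ny
    total = 0.0
    for i in range(nx):
        cx = xmin + (i + 0.5) * dx
        for j in range(ny):
            cy = ymin + (j + 0.5) * dy
            if inside((cx, cy), poly):
                total += gaussian_pdf(cx, cy, mean, sigma) * dx * dy
    return total

# Field of view covering the whole uncertainty region vs. only half of it.
full_view = [(-10, -10), (10, -10), (10, 10), (-10, 10)]
half_view = [(0, -10), (10, -10), (10, 10), (0, 10)]
q_full = detection_quality((0.0, 0.0), 1.0, full_view)
q_half = detection_quality((0.0, 0.0), 1.0, half_view)
```

Because the integrand and the domain are passed in separately, a different uncertainty model or field-of-view shape can be swapped in without touching the integration loop, which mirrors the motivation for a numerical-integration framework.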

In recent years, sensors that retrieve depth information from a scene, such as LIDAR or RGBD sensors (Kinect), have proliferated. However, it is still a challenge to interpret a specific point cloud and recognize the underlying object. Here, we ask whether it is possible to define a global feature for an object that is robust to noise, sampling and occlusion. We propose a local measure based on curvature. We call it Principal Curvatures because rather than using the Gaussian curvature we keep the

In this paper, we propose an original methodology for computing saliency maps from high-dimensional RTI (Reflectance Transformation Imaging) data. Unlike most classical methods, our approach aims to derive an intrinsic visual saliency of the surface, independent of the sensor (image) and the geometry of the scene (light-object-camera).

Unique encoding of the dynamics of facial actions has the potential to provide a spontaneous facial expression recognition system. The most promising existing approaches rely on deep learning of facial actions. However, current approaches are often computationally intensive and require a great deal of memory and processing time. They also typically ignore the temporal aspect of facial actions, despite the potential wealth of information available from the spatial movements and their temporal evolution from the neutral state to the apex state.
