
Immersive Optical-See-Through Augmented Reality. Augmented Reality has been getting ready for the last 20 years, and is finally becoming real, powered by progress in enabling technologies such as graphics, vision, sensors, and displays. In this talk I’ll provide a personal retrospective on my journey, working on all those enablers, getting ready for the coming AR revolution. At Meta, we are working on an immersive optical-see-through AR headset, as well as the full software stack. We’ll discuss the differences of optical vs.


We address the challenge of local feature matching under large scale and rotation changes by focusing on keypoint positions.
First, we propose a novel module called similarity normalization (SN).
This module normalizes keypoint positions to remove translation, rotation, and scale differences between an image pair.
By performing positional encoding on these normalized positions, a network that incorporates SN can avoid encoding largely different positions into the descriptors of the two images.
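
To make the idea of normalizing keypoint positions concrete, here is a minimal sketch and not the authors' SN module. It assumes the relative rotation angle is supplied by the caller (the abstract does not say how it is obtained), removes translation by centering on the centroid, and removes scale by dividing by the RMS radius. The function name `similarity_normalize` and its parameters are hypothetical.

```python
import numpy as np

def similarity_normalize(keypoints, angle=0.0):
    """Center 2D keypoints, undo a rotation of `angle` radians,
    and rescale them to unit RMS distance from the centroid.

    `angle` is an assumed input: how the rotation is estimated
    is not described in the abstract.
    """
    pts = np.asarray(keypoints, dtype=float)
    # Remove translation: move the centroid to the origin.
    centered = pts - pts.mean(axis=0)
    # Remove rotation: rotate every point by -angle.
    c, s = np.cos(-angle), np.sin(-angle)
    rot = np.array([[c, -s], [s, c]])
    rotated = centered @ rot.T
    # Remove scale: divide by the RMS radius (guarding against zero).
    rms = np.sqrt((rotated ** 2).sum(axis=1).mean())
    return rotated / max(rms, 1e-12)

# Usage: the same keypoints seen under a known similarity transform.
rng = np.random.default_rng(0)
pts_a = rng.normal(size=(50, 2))
theta, scale, shift = 0.7, 2.5, np.array([10.0, -3.0])
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
pts_b = scale * pts_a @ R.T + shift

norm_a = similarity_normalize(pts_a)
norm_b = similarity_normalize(pts_b, angle=theta)
```

After normalization the two point sets coincide up to floating-point error, so a positional encoding computed from them would see matching positions in both images.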


In this work, we propose a retrieval-based method for improving open vocabulary panoptic segmentation.


Although many deepfake detection methods have been proposed to fight against severe misuse of generative AI, none provide detailed human-interpretable explanations beyond simple real/fake responses. This limitation makes it challenging for humans to assess the accuracy of detection results, especially when the models encounter unseen deepfakes. To address this issue, we propose a novel deepfake detector based on a large Vision-Language Model (VLM), capable of explaining manipulated facial regions.


Pre-trained large foundation models play a central role in the recent surge of artificial intelligence, resulting in fine-tuned models with remarkable abilities when measured on benchmark datasets, standard exams, and applications. Due to their inherent complexity, these models are not well understood; in particular, the structures of the representation space are not well characterized despite their fundamental importance. In this paper,


A copy detection system aims to identify whether a query image is an edited or manipulated copy of an image from a large reference database with millions of images. While global image descriptors can retrieve visually similar images, they struggle to differentiate near-duplicates from semantically similar instances. We propose a dual-triplet metric learning (DTML) technique to learn global image features that group near-duplicates closer than visually similar images while maintaining the semantic structure of the embedding space.
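
The abstract does not give the DTML loss, so the following is only an illustrative sketch of how two triplet terms could express the stated goal. It assumes Euclidean distances and two hinge terms: one pulling a near-duplicate closer to the anchor than a visually similar image, and one keeping the visually similar image closer than a dissimilar one. The function name, the four-way sampling, and both margins are hypothetical.

```python
import numpy as np

def dual_triplet_loss(anchor, near_dup, similar, dissimilar,
                      margin_dup=0.2, margin_sem=0.2):
    """Illustrative two-term triplet loss (not the published DTML loss).

    All inputs are embeddings of shape (d,) or (batch, d).
    """
    anchor, near_dup, similar, dissimilar = (
        np.asarray(x, dtype=float)
        for x in (anchor, near_dup, similar, dissimilar)
    )

    def dist(x, y):
        # Euclidean distance along the embedding dimension.
        return np.linalg.norm(x - y, axis=-1)

    d_dup = dist(anchor, near_dup)
    d_sim = dist(anchor, similar)
    d_neg = dist(anchor, dissimilar)
    # Term 1: near-duplicates should be closer than visually similar images.
    loss_dup = np.maximum(0.0, d_dup - d_sim + margin_dup)
    # Term 2: visually similar images should be closer than dissimilar ones.
    loss_sem = np.maximum(0.0, d_sim - d_neg + margin_sem)
    return float(np.mean(loss_dup + loss_sem))

# Usage: a well-ordered embedding incurs no loss, a violated ordering does.
ordered = dual_triplet_loss([0, 0], [0.1, 0], [1, 0], [3, 0])
violated = dual_triplet_loss([0, 0], [1, 0], [0.5, 0], [3, 0])
```

Here `ordered` is zero because both orderings hold with room to spare, while `violated` is positive because the near-duplicate sits farther from the anchor than the visually similar image.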


This 2-page document provides supplementary material for the paper titled "Perceptual Classifiers for Detecting Generative Images". It provides details on the datasets used and their composition. We also include the real and fake detection accuracies for each class to help readers better understand the strengths and drawbacks of the proposed approach. Finally, we provide t-SNE visualizations to understand the effectiveness of the chosen feature extractors.

