
This document provides supplementary material for the paper "Latent Enhancing AutoEncoder for Occluded Image Classification", submitted to the regular track of ICIP 2024. It details the architecture of LEARN, illustrates the improvement in inter-class differentiability in latent space on the OccludedPASCAL3D+ dataset (hereafter referred to as Pascal), and reports detailed classification results.


Drones have been widely employed in various fields, but the number of drones used illegally and for hazardous purposes has recently increased. To counter illegal drone use, we propose a novel framework for reconstructing three-dimensional (3D) drone trajectories using a single camera. By leveraging calibrated cameras, we exploit the relationship between 2D and 3D spaces. We automatically track the drones in 2D images using a drone tracker and estimate their 2D rotations.
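The relationship between 2D and 3D spaces that a calibrated camera provides can be illustrated with the standard pinhole model. The sketch below is not the authors' reconstruction framework; it only shows the textbook projection and its inverse, assuming no lens distortion and a known depth. The function names and the intrinsic values (`fx`, `fy`, `cx`, `cy`) are hypothetical.

```python
from typing import Tuple

Point3D = Tuple[float, float, float]
Pixel = Tuple[float, float]


def project(point: Point3D, fx: float, fy: float, cx: float, cy: float) -> Pixel:
    """Project a 3D point in camera coordinates onto the image plane (pinhole model)."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must lie in front of the camera (z > 0)")
    return (fx * x / z + cx, fy * y / z + cy)


def back_project(pixel: Pixel, depth: float,
                 fx: float, fy: float, cx: float, cy: float) -> Point3D:
    """Recover the 3D point for a pixel, assuming its depth along the optical axis is known."""
    u, v = pixel
    return ((u - cx) * depth / fx, (v - cy) * depth / fy, depth)


# Hypothetical intrinsics: focal lengths and principal point in pixels.
FX, FY, CX, CY = 800.0, 800.0, 320.0, 240.0

pixel = project((1.0, 2.0, 4.0), FX, FY, CX, CY)
point = back_project(pixel, 4.0, FX, FY, CX, CY)
```

A single pixel only constrains a ray through the camera center, so the depth passed to `back_project` has to come from somewhere else; with a single camera that extra information is exactly what a reconstruction method must supply.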


Multi-class multi-instance segmentation is the task of identifying masks for multiple object classes and multiple instances of the same class within an image. The Segment Anything Model (SAM) is a new foundation model designed for promptable multi-class multi-instance segmentation. In its "everything" mode, SAM can segment objects in any image using a predefined point grid as the input prompt. However, out of the box, SAM tends to output part or sub-part segmentation masks (under-segmentation) in different real-world applications.
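A predefined point grid of the kind used as a prompt can be sketched as a uniform lattice of points placed at cell centers. This is a minimal illustration in plain Python, not SAM's own code; the function names and the grid size are assumptions chosen for the example.

```python
from typing import List, Tuple


def build_point_grid(n_per_side: int) -> List[Tuple[float, float]]:
    """Return n_per_side**2 points in [0, 1] x [0, 1], one at the center of each grid cell."""
    offset = 1.0 / (2 * n_per_side)
    coords = [offset + i / n_per_side for i in range(n_per_side)]
    # Row-major order: x varies fastest.
    return [(x, y) for y in coords for x in coords]


def to_pixels(grid: List[Tuple[float, float]],
              width: int, height: int) -> List[Tuple[float, float]]:
    """Scale normalized grid points to pixel coordinates for a given image size."""
    return [(x * width, y * height) for x, y in grid]


grid = build_point_grid(4)               # 16 normalized prompt points
pixel_prompts = to_pixels(grid, 640, 480)
```

Each grid point would be passed to the model as a separate point prompt, so a denser grid yields more candidate masks at a higher compute cost.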


Identifying people in a group photo with face recognition models has applications in various fields. There are two major challenges: first, the presence of several faces with varying degrees of clarity and scale, and second, the angular orientation of faces in typical group photos. Detecting and cropping the faces has been reasonably solved using various segmentation-like models. Recognizing identity after cropping a frontal face has also been successful to some ex-


Prune Channel and Distill: Discriminative Knowledge Distillation for Semantic Segmentation - Supplementary Material -


We consider unsupervised domain adaptation (UDA) for semantic segmentation in which the model is trained on a labeled source dataset and adapted to an unlabeled target dataset. Unfortunately, current self-training methods are susceptible to misclassified pseudo-labels resulting from erroneous predictions. Since certain classes are typically associated with less reliable predictions in UDA, reducing the impact of such pseudo-labels without skewing the training towards some classes is notoriously difficult.
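To make the pseudo-label problem concrete, here is a minimal sketch of a common self-training baseline: confidence-thresholded pseudo-labeling. It is not the method proposed in the paper. The threshold value, the ignore index of 255, and the function name are assumptions for illustration.

```python
from typing import List, Sequence

IGNORE_INDEX = 255  # assumed label for pixels excluded from the loss


def pseudo_labels(probs: Sequence[Sequence[float]], threshold: float = 0.9) -> List[int]:
    """Assign each pixel its most likely class, or IGNORE_INDEX if confidence is below threshold.

    probs holds one list of class probabilities per pixel.
    """
    labels = []
    for p in probs:
        best = max(range(len(p)), key=p.__getitem__)  # argmax over classes
        labels.append(best if p[best] >= threshold else IGNORE_INDEX)
    return labels


# Three target-domain pixels with predicted probabilities over two classes.
labels = pseudo_labels([[0.95, 0.05], [0.6, 0.4], [0.1, 0.9]], threshold=0.9)
```

A confident but wrong prediction still passes the threshold, and classes that rarely reach high confidence are mostly ignored, which is one way such a scheme can end up skewing training towards some classes.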


Here are supplementary materials for the paper "When Segment Anything Model Meets Food Instance Segmentation", which include an appendix for the paper, examples of FoodInsSeg, examples of InsSAM-Tool, and dataset documentation. These materials aim to give readers more detail for a better understanding of the research contributions of this work.


We present a new method for learning a fine-grained representation of visual style. Representation learning aims to discover individual salient features of a domain in a compact and descriptive form that strongly identifies the unique characteristics of that domain. Prior work on visual style representation attempts to disentangle style (i.e., appearance) from content (i.e., semantics), but a complete separation has yet to be achieved. We present a technique to learn a representation of visual style more strongly disentangled from the semantic content depicted in an image.
