VISUAL ADAPT FOR RGBD TRACKING

Recent RGBD trackers use a cueing technique: depth images are overlaid as cues onto RGB images, and the combined images are fed into an RGB-based model for tracking. However, this direct overlay introduces additional noise into the feature space, and the RGB-based model adapts poorly to mixed-modality inputs. To address these issues, we introduce Visual Adapt for RGBD Tracking (VADT). Specifically, we keep the input of the RGB-based model purely RGB. We design a fusion module that enables interaction between depth and RGB features, and a Depth Adapt module that lets the image features interact with the fused features: it applies cross-attention between the resulting depth-assisted features and the RGB search-frame features output by the RGB-based model. Experimental results show that the proposed tracker achieves state-of-the-art results on several RGBD benchmarks.
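The pipeline described above (fuse depth and RGB features, then cross-attend with the RGB search-frame features) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the fusion operator, the attention direction, or any dimensions, so the sum-of-projections fusion in `fuse`, the choice of search-frame tokens as queries, the residual connection, and all names and shapes below are assumptions made for illustration only.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def fuse(rgb_tokens, depth_tokens, w_rgb, w_depth):
    # Hypothetical fusion module: project each modality and sum.
    # The abstract does not state how fusion is actually done.
    return rgb_tokens @ w_rgb + depth_tokens @ w_depth


def cross_attention(queries, context, w_q, w_k, w_v):
    # Single-head scaled dot-product cross-attention.
    q = queries @ w_q
    k = context @ w_k
    v = context @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)
    return weights @ v, weights


def depth_adapt(search_tokens, fused_tokens, w_q, w_k, w_v):
    # Assumed Depth Adapt step: RGB search-frame tokens (queries) attend to
    # the depth-assisted fused tokens (keys/values), with a residual add.
    attended, weights = cross_attention(search_tokens, fused_tokens, w_q, w_k, w_v)
    return search_tokens + attended, weights


rng = np.random.default_rng(0)
n_tokens, dim = 16, 32  # illustrative sizes, not taken from the paper

rgb_tokens = rng.standard_normal((n_tokens, dim))     # RGB features
depth_tokens = rng.standard_normal((n_tokens, dim))   # depth features
search_tokens = rng.standard_normal((n_tokens, dim))  # RGB model output for the search frame

# Random projection matrices stand in for learned weights.
w_rgb, w_depth, w_q, w_k, w_v = (
    rng.standard_normal((dim, dim)) / np.sqrt(dim) for _ in range(5)
)

fused = fuse(rgb_tokens, depth_tokens, w_rgb, w_depth)
adapted, attn = depth_adapt(search_tokens, fused, w_q, w_k, w_v)
print(adapted.shape, attn.shape)
```

In this sketch the RGB-based model never sees depth input; depth information reaches the search-frame features only through the attention step, which mirrors the stated motivation for keeping the backbone input purely RGB.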
