VISUAL ADAPT FOR RGBD TRACKING
- Submitted by:
- GuangTong Zhang
- Last updated:
- 31 March 2024 - 5:21am
- Document Type:
- Presentation Slides
Recent RGBD trackers overlay depth-modality images as cues onto RGB-modality images and feed the result into an RGB-based model for tracking. However, this direct overlay interaction between modalities not only introduces noise into the feature space but also exposes the RGB-based model's inability to adapt to mixed-modality inputs. To address these issues, we introduce Visual Adapt for RGBD Tracking (VADT). Specifically, we keep the input of the RGB-based model as the pure RGB modality. In addition, we design a fusion module that enables modality interaction between depth and RGB features. We then formulate a Depth Adapt module to facilitate image interaction with the fused features: it cross-attends the resulting depth-assisted features with the RGB search-frame features produced by the RGB-based model. Experimental results show that our tracker achieves state-of-the-art performance on various RGBD benchmarks.
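The cross-attention step of the Depth Adapt module described above can be sketched minimally as follows. This is an illustrative NumPy sketch only, not the paper's implementation: the function name `cross_attention`, the token counts, and the feature dimension are assumptions for demonstration; RGB search-frame features serve as queries and depth-assisted fused features as keys/values.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(query, key_value):
    """Single-head cross-attention (sketch): query tokens attend to
    key/value tokens. Shapes: query (Nq, d), key_value (Nkv, d)."""
    d = query.shape[-1]
    scores = query @ key_value.T / np.sqrt(d)   # (Nq, Nkv) similarity
    weights = softmax(scores, axis=-1)          # rows sum to 1
    return weights @ key_value                  # (Nq, d) attended output

# Hypothetical feature tensors: 16 tokens each, dimension 64.
rng = np.random.default_rng(0)
rgb_search = rng.standard_normal((16, 64))    # RGB search-frame features
depth_fused = rng.standard_normal((16, 64))   # depth-assisted fused features
out = cross_attention(rgb_search, depth_fused)
print(out.shape)  # (16, 64)
```

In the full module one would expect learned query/key/value projections and multiple heads; this sketch shows only the attention interaction itself.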