

Citation Author(s):
Guangtong Zhang, Qihua Liang, Zhiyi Mo, Ning Li, Bineng Zhong
Submitted by:
GuangTong Zhang
Last updated:
31 March 2024 - 5:21am
Document Type:
Presentation Slides

Recent RGBD trackers have adopted a cueing strategy in which depth-modality images are overlaid onto RGB-modality images as cues and then fed into an RGB-based model for tracking. However, this direct overlay interaction between modalities not only introduces additional noise into the feature space but also leaves the RGB-based model poorly adapted to mixed-modality inputs. To address these issues, we introduce Visual Adapt for RGBD Tracking (VADT). Specifically, we keep the input of the RGB-based model purely in the RGB modality. In addition, we devise a fusion module that enables modality interaction between depth and RGB features. A Depth Adapt module is then formulated to facilitate interaction with the fused features: it cross-attends the resulting depth-assisted features to the RGB search-frame features produced by the RGB-based model. Experimental results show that our tracker achieves state-of-the-art performance on multiple RGBD benchmarks.
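The cross-attention step described above can be sketched in a minimal form: RGB search-frame features act as queries, while the depth-assisted fused features supply keys and values. The shapes, projection matrices, and function names below are illustrative assumptions, not the authors' implementation; learned weights are replaced by random stand-ins.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(rgb_feats, depth_feats, d_k=64, seed=0):
    """Cross-attend RGB search-frame tokens (queries) to
    depth-assisted tokens (keys/values).

    rgb_feats:   (N_rgb, d) tokens from the RGB-based model
    depth_feats: (N_dep, d) tokens from the fusion module
    NOTE: the projection matrices here are random stand-ins
    for learned weights -- this is a shape-level sketch only.
    """
    rng = np.random.default_rng(seed)
    d = rgb_feats.shape[-1]
    W_q = rng.standard_normal((d, d_k)) / np.sqrt(d)
    W_k = rng.standard_normal((d, d_k)) / np.sqrt(d)
    W_v = rng.standard_normal((d, d_k)) / np.sqrt(d)

    Q = rgb_feats @ W_q            # (N_rgb, d_k)
    K = depth_feats @ W_k          # (N_dep, d_k)
    V = depth_feats @ W_v          # (N_dep, d_k)

    # Each RGB token attends over all depth-assisted tokens.
    attn = softmax(Q @ K.T / np.sqrt(d_k))   # (N_rgb, N_dep)
    return attn @ V                          # depth-informed RGB features

# Example: 16 RGB search tokens attend to 20 depth-assisted tokens.
rgb = np.random.default_rng(1).standard_normal((16, 32))
dep = np.random.default_rng(2).standard_normal((20, 32))
out = cross_attention(rgb, dep)   # shape (16, 64)
```

The key design point this sketch illustrates is that the RGB-based model's input stays pure RGB; depth information only enters through this attention pathway, rather than by overlaying modalities at the input.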
