
Bi-directional Adapter for Multi-modal Tracking

About

Driven by the rapid development of computer vision, single-modal (RGB) object tracking has made significant progress in recent years. Because a single imaging sensor has inherent limitations, multi-modal images (RGB, infrared, etc.) are introduced to compensate and enable all-weather object tracking in complex environments. However, sufficient multi-modal tracking data is hard to acquire, and the dominant modality changes with the open environment, so most existing techniques fail to extract complementary multi-modal information dynamically and yield unsatisfactory tracking performance. To handle this problem, we propose a novel multi-modal visual prompt tracking model based on a universal bi-directional adapter that cross-prompts multiple modalities mutually. Our model consists of a universal bi-directional adapter and multiple modality-specific transformer encoder branches with shared parameters. The encoders extract the features of each modality separately using a frozen pre-trained foundation model. We develop a simple but effective lightweight feature adapter that transfers modality-specific information from one modality to another, performing visual feature prompt fusion in an adaptive manner. By adding only a small number (0.32M) of trainable parameters, our model achieves superior tracking performance compared with both full fine-tuning methods and prompt learning-based methods. Our code is available at https://github.com/SparkTempest/BAT.
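The cross-prompting idea described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation (see the linked repository for that). It assumes a single stand-in "frozen layer" in place of a real transformer block, one adapter shared by both directions, a down-project/mix/up-project bottleneck layout, and a zero-initialised up-projection. All names and sizes (`Adapter`, `frozen_layer`, `cross_prompt_step`, `dim`, `bottleneck`) are hypothetical and chosen only for illustration.

```python
import numpy as np


class Adapter:
    """Bottleneck adapter: down-project, mix, up-project (hypothetical layout)."""

    def __init__(self, dim, bottleneck, rng):
        self.down = rng.normal(0.0, 0.02, size=(dim, bottleneck))
        self.mid = rng.normal(0.0, 0.02, size=(bottleneck, bottleneck))
        # Assumption: zero-initialised up-projection, so the prompt starts at zero.
        self.up = np.zeros((bottleneck, dim))

    def __call__(self, tokens):
        return tokens @ self.down @ self.mid @ self.up

    def num_params(self):
        return self.down.size + self.mid.size + self.up.size


def frozen_layer(tokens, weight):
    """Stand-in for one frozen, shared encoder layer (not a real transformer block)."""
    return np.tanh(tokens @ weight)


def cross_prompt_step(rgb, tir, weight, adapter):
    """Run the shared frozen layer on each branch, then add a prompt from the other modality."""
    rgb_out = frozen_layer(rgb, weight) + adapter(tir)  # infrared prompts RGB
    tir_out = frozen_layer(tir, weight) + adapter(rgb)  # RGB prompts infrared
    return rgb_out, tir_out


rng = np.random.default_rng(0)
dim, bottleneck, n_tokens = 16, 4, 5             # toy sizes, not the paper's
weight = rng.normal(0.0, 0.1, size=(dim, dim))   # frozen: shared by both branches
adapter = Adapter(dim, bottleneck, rng)          # the only part that would be trained
rgb = rng.normal(size=(n_tokens, dim))
tir = rng.normal(size=(n_tokens, dim))
rgb_out, tir_out = cross_prompt_step(rgb, tir, weight, adapter)
```

In this sketch only the adapter's matrices would receive gradient updates, while `weight` stays fixed and is reused by both branches. That is the sense in which the trainable parameter count can stay small relative to the backbone; the toy sizes here are not meant to reproduce the 0.32M figure.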

Bing Cao, Junliang Guo, Pengfei Zhu, Qinghua Hu • 2023

Related benchmarks

Task | Dataset | Result | Rank
RGB-T Tracking | LasHeR (test) | PR 70.2 | 257
RGB-T Tracking | RGBT234 (test) | Precision Rate 86.8 | 203
RGB-T Tracking | GTOT | PR 90.9 | 138
RGB-T Tracking | RGBT234 | Precision 86.8 | 121
RGBT Tracking | LasHeR | PR 70.2 | 120
RGBT Tracking | RGBT234 | PR 86.8 | 112
RGBT Tracking | LasHeR | PR 70.2 | 62
RGBT Tracking | RGBT 234 | Precision Rate 86.8 | 53
Object Tracking | RGBT234 | MSR 64.1 | 21
Multi-modal Object Tracking | LasHeR | Precision (Pr) 70.2 | 19
Showing 10 of 15 rows
