
Transformer Tracking

About

Correlation plays a critical role in the tracking field, especially in recent popular Siamese-based trackers. The correlation operation is a simple way to fuse features by measuring the similarity between the template and the search region. However, the correlation operation itself is a local linear matching process, which tends to lose semantic information and fall into local optima easily, and this may be the bottleneck in designing high-accuracy tracking algorithms. Is there a better feature fusion method than correlation? To address this issue, inspired by the Transformer, this work presents a novel attention-based feature fusion network, which effectively combines the template and search region features using attention alone. Specifically, the proposed method includes an ego-context augment module based on self-attention and a cross-feature augment module based on cross-attention. Finally, we present a Transformer tracking method (named TransT) based on a Siamese-like feature extraction backbone, the designed attention-based fusion mechanism, and a classification and regression head. Experiments show that TransT achieves very promising results on six challenging datasets, especially on the large-scale LaSOT, TrackingNet, and GOT-10k benchmarks. Our tracker runs at approximately 50 fps on GPU. Code and models are available at https://github.com/chenxin-dlut/TransT.

Xin Chen, Bin Yan, Jiawen Zhu, Dong Wang, Xiaoyun Yang, Huchuan Lu • 2021
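
The attention-based fusion described in the abstract can be illustrated with a minimal sketch. It is not the TransT implementation: it assumes single-head scaled dot-product attention with a residual connection, and it omits learned projections, positional encodings, and feed-forward layers. The function names (`ego_context_augment`, `cross_feature_augment`) and the toy feature vectors are hypothetical and chosen only to mirror the two modules named above. Plain Python lists are used so the example has no dependencies.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of feature vectors."""
    scale = math.sqrt(len(keys[0]))
    out = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(dim).
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        out.append([sum(w * v[d] for w, v in zip(weights, values))
                    for d in range(len(values[0]))])
    return out


def add_residual(x, y):
    """Element-wise sum of two equally shaped lists of vectors."""
    return [[a + b for a, b in zip(xr, yr)] for xr, yr in zip(x, y)]


def ego_context_augment(x):
    """Self-attention (assumed form): features attend to their own map."""
    return add_residual(x, attention(x, x, x))


def cross_feature_augment(x, other):
    """Cross-attention (assumed form): queries from x, keys/values from other."""
    return add_residual(x, attention(x, other, other))


# Toy features: 2 template positions and 3 search positions, 2 channels each.
template = [[1.0, 0.0], [0.0, 1.0]]
search = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]

template_aug = ego_context_augment(template)
search_aug = ego_context_augment(search)
fused = cross_feature_augment(search_aug, template_aug)
```

In this sketch every search position can draw on every template position through the softmax weights, whereas a correlation operation compares the two feature maps by local linear matching. The fused output keeps the shape of the search features, one vector per search position.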

Related benchmarks

Task | Dataset | Result | Rank
Visual Object Tracking | TrackingNet (test) | Normalized Precision (Pnorm) 86.8 | 460
Visual Object Tracking | LaSOT (test) | AUC 64.9 | 444
Visual Object Tracking | GOT-10k (test) | Average Overlap 72.3 | 378
Object Tracking | LaSoT | AUC 64.9 | 333
RGB-T Tracking | LasHeR (test) | PR 52.4 | 244
Object Tracking | TrackingNet | Precision (P) 80.3 | 225
Visual Object Tracking | GOT-10k | AO 76.8 | 223
RGB-T Tracking | RGBT234 (test) | Precision Rate 82.7 | 189
Visual Object Tracking | UAV123 (test) | AUC 69.1 | 188
Visual Object Tracking | UAV123 | AUC 0.694 | 165
Showing 10 of 72 rows