Autoregressive Queries for Adaptive Tracking with Spatio-Temporal Transformers

About

Rich spatio-temporal information is crucial for capturing complicated target appearance variations in visual tracking. However, most top-performing tracking algorithms rely on many hand-crafted components for spatio-temporal information aggregation. Consequently, the spatio-temporal information is far from fully explored. To alleviate this issue, we propose an adaptive tracker with spatio-temporal transformers (named AQATrack), which adopts simple autoregressive queries to effectively learn spatio-temporal information without many hand-designed components. Firstly, we introduce a set of learnable and autoregressive queries to capture the instantaneous target appearance changes in a sliding-window fashion. Then, we design a novel attention mechanism for the interaction of existing queries to generate a new query in the current frame. Finally, based on the initial target template and the learnt autoregressive queries, a spatio-temporal information fusion module (STM) is designed for spatio-temporal information aggregation to locate the target object. Benefiting from the STM, we can effectively combine the static appearance and instantaneous changes to guide robust tracking. Extensive experiments show that our method significantly improves the tracker's performance on six popular tracking benchmarks: LaSOT, LaSOText, TrackingNet, GOT-10k, TNL2K, and UAV123.

Jinxia Xie, Bineng Zhong, Zhiyi Mo, Shengping Zhang, Liangtao Shi, Shuxiang Song, Rongrong Ji • 2024
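
The sliding-window query update described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `QueryWindow`, the single-head dot-product attention, the residual update, and the toy 2-dimensional vectors are all assumptions made for illustration, and the fixed `init_query` merely stands in for a learnable query.

```python
import math
from collections import deque


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, memory):
    """Scaled dot-product attention of one query vector over a list of vectors."""
    scale = math.sqrt(len(query))
    scores = [sum(q * m for q, m in zip(query, vec)) / scale for vec in memory]
    weights = softmax(scores)
    return [sum(w * vec[i] for w, vec in zip(weights, memory))
            for i in range(len(query))]


class QueryWindow:
    """Hypothetical sliding window of per-frame queries (illustrative only)."""

    def __init__(self, init_query, window_size):
        self.init_query = list(init_query)      # stand-in for a learnable query
        self.past = deque(maxlen=window_size)   # oldest query drops out automatically

    def step(self, frame_tokens):
        # 1) Form the new query by attending over the queries already in the window.
        memory = [self.init_query] + list(self.past)
        query = attend(self.init_query, memory)
        # 2) Let the new query read the current frame's feature tokens (residual update).
        update = attend(query, frame_tokens)
        query = [q + u for q, u in zip(query, update)]
        # 3) Feed the result back so later frames can condition on it.
        self.past.append(query)
        return query


# Toy usage: three frames, one feature token each, window of two queries.
window = QueryWindow(init_query=[1.0, 0.0], window_size=2)
for frame_tokens in ([[0.5, 0.5]], [[0.0, 1.0]], [[1.0, 1.0]]):
    latest = window.step(frame_tokens)
print(len(window.past), latest)
```

The bounded `deque` is what makes the update autoregressive over a fixed horizon: each new query depends on the previous ones, and queries older than the window are discarded.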

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Visual Object Tracking | TrackingNet (test) | Normalized Precision (Pnorm) | 89.3 | 460 |
| Visual Object Tracking | LaSOT (test) | AUC | 72.7 | 444 |
| Visual Object Tracking | GOT-10k (test) | Average Overlap | 76 | 378 |
| Object Tracking | LaSoT | AUC | 71.4 | 333 |
| Object Tracking | TrackingNet | Precision (P) | 83.1 | 225 |
| Visual Object Tracking | GOT-10k | AO | 73.8 | 223 |
| Visual Object Tracking | UAV123 (test) | AUC | 71.2 | 188 |
| Visual Object Tracking | LaSOText (test) | AUC | 52.7 | 85 |
| Visual Object Tracking | TNL2k (test) | AUC | 59.3 | 74 |
| Visual Object Tracking | GOT-10k 1.0 (test) | AO | 76 | 51 |
Showing 10 of 19 rows
