
Reading Relevant Feature from Global Representation Memory for Visual Object Tracking

About

Reference features from a template or from historical frames are crucial for visual object tracking. Prior works use all features from a fixed template or memory. However, because videos are dynamic, the historical reference information that a search region needs varies across search regions and time steps. Using all features in the template and memory can therefore introduce redundancy and impair tracking performance. To alleviate this issue, we propose a novel tracking paradigm, consisting of a relevance attention mechanism and a global representation memory, which adaptively helps the search region select the most relevant historical information from the reference features. Specifically, unlike previous approaches, the proposed relevance attention mechanism dynamically chooses and builds the optimal global representation memory for the current frame by accessing cross-frame information globally. Moreover, it can flexibly read the relevant historical information from the constructed memory to reduce redundancy and counteract the negative effects of harmful information. Extensive experiments validate the effectiveness of the proposed method, which achieves competitive performance on five challenging datasets while running at 71 FPS.
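The idea of reading only the most relevant entries from a memory can be illustrated with a generic top-k attention read. This is a minimal sketch and not the authors' implementation: the function name `read_relevant`, the `top_k` parameter, and the use of scaled dot-product scores are all assumptions made for illustration, and the paper's actual relevance attention and memory construction may differ.

```python
import math


def read_relevant(query, memory_keys, memory_values, top_k=2):
    """Hypothetical helper: read from memory using only the top_k most relevant entries."""
    dim = len(query)
    # Scaled dot-product relevance score between the query and every memory key.
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
        for key in memory_keys
    ]
    # Indices of the top_k highest-scoring memory entries; the rest are ignored.
    keep = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:top_k]
    # Softmax over the kept scores only (subtract the max for numerical stability).
    max_score = max(scores[i] for i in keep)
    exp_scores = {i: math.exp(scores[i] - max_score) for i in keep}
    total = sum(exp_scores.values())
    weights = {i: e / total for i, e in exp_scores.items()}
    # Weighted sum of the selected value vectors.
    out_dim = len(memory_values[0])
    readout = [
        sum(weights[i] * memory_values[i][d] for i in keep)
        for d in range(out_dim)
    ]
    return readout, sorted(keep)


# Toy example: the third memory entry points away from the query and is dropped.
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.9, 0.1], [-1.0, 0.0]]
values = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
readout, kept = read_relevant(query, keys, values, top_k=2)
```

In this toy setup the readout mixes only the two entries whose keys align with the query, so the dissimilar third entry contributes nothing. That mirrors, in a simplified way, the stated goal of reducing redundancy from irrelevant reference features.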

Xinyu Zhou, Pinxue Guo, Lingyi Hong, Jinglun Li, Wei Zhang, Weifeng Ge, Wenqiang Zhang • 2024

Related benchmarks

Task | Dataset | Result | Rank
Visual Object Tracking | TrackingNet (test) | Normalized Precision (Pnorm) 89.6 | 460
Visual Object Tracking | LaSOT (test) | AUC 70.3 | 444
Visual Object Tracking | GOT-10k (test) | Average Overlap 74.1 | 378
Object Tracking | TrackingNet | Precision (P) 83.6 | 225
Visual Object Tracking | UAV123 (test) | -- | 188
Object Tracking | GOT-10k | AO 74.1 | 74
Object Tracking | OTB (test) | Success Rate 71.5 | 9
