ODTrack: Online Dense Temporal Token Learning for Visual Tracking

About

Online contextual reasoning and association across consecutive video frames are critical for perceiving instances in visual tracking. However, most current top-performing trackers rely on sparse temporal relationships between reference and search frames, learned in an offline mode. Consequently, they can only interact independently within each image pair and establish limited temporal correlations. To alleviate this problem, we propose a simple, flexible and effective video-level tracking pipeline, named ODTrack, which densely associates the contextual relationships of video frames through online token propagation. ODTrack receives video frames of arbitrary length to capture the spatio-temporal trajectory relationships of an instance, and compresses the discriminative features (localization information) of a target into a token sequence to achieve frame-to-frame association. This solution brings the following benefits: 1) the purified token sequences can serve as prompts for inference in the next video frame, so that past information guides future inference; 2) the iterative propagation of token sequences avoids complex online update strategies, which yields more efficient model representation and computation. ODTrack achieves new state-of-the-art (SOTA) performance on seven benchmarks while running at real-time speed. Code and models are available at https://github.com/GXNU-ZhongLab/ODTrack.
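The token propagation idea described above can be illustrated with a toy sketch. This is not the ODTrack implementation: the names `attend` and `track_video`, the single-token state, and the use of plain scaled dot-product attention over per-frame patch features are all assumptions made for illustration. The sketch only shows the general pattern, in which a token summarizing the target is updated on each frame and then reused as the prompt for the next frame.

```python
import math
from typing import List, Sequence, Tuple

Vector = List[float]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(token: Vector, features: Sequence[Vector]) -> Tuple[List[float], Vector]:
    """Hypothetical helper: one query token attends over a frame's patch features.

    Returns the attention weights and the weighted sum of the features.
    """
    scale = math.sqrt(len(token))
    scores = [sum(t * f for t, f in zip(token, feat)) / scale for feat in features]
    weights = softmax(scores)
    pooled = [
        sum(w * feat[i] for w, feat in zip(weights, features))
        for i in range(len(token))
    ]
    return weights, pooled


def track_video(
    frames: Sequence[Sequence[Vector]], init_token: Vector
) -> List[Tuple[int, Vector]]:
    """Hypothetical loop: propagate one token through frames of arbitrary count.

    For each frame, the token from the previous frame acts as the prompt.
    The patch with the highest attention weight is taken as the target
    location, and the pooled features become the token for the next frame.
    """
    token = list(init_token)
    results = []
    for features in frames:
        weights, token = attend(token, features)
        best_patch = max(range(len(weights)), key=weights.__getitem__)
        results.append((best_patch, token))
    return results


# Three toy frames with two patches each; the target-like patch [1, 0]
# moves from position 0 to position 1 and back.
frames = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.0, 1.0], [1.0, 0.0]],
    [[1.0, 0.0], [0.0, 1.0]],
]
results = track_video(frames, init_token=[1.0, 0.0])
```

No per-frame update rule or template refresh is needed in this loop: the only state carried forward is the token itself, which is the property the abstract attributes to iterative token propagation.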

Yaozong Zheng, Bineng Zhong, Qihua Liang, Zhiyi Mo, Shengping Zhang, Xianxian Li • 2024

Related benchmarks

Task	Dataset	Metric	Result	
Visual Object Tracking	TrackingNet (test)	Normalized Precision (Pnorm)	91	502
Object Tracking	LaSOT	AUC	74	498
Visual Object Tracking	LaSOT (test)	AUC	74	470
Visual Object Tracking	GOT-10k (test)	Average Overlap	78.2	450
Object Tracking	TrackingNet	Precision (P)	86.7	327
Visual Object Tracking	GOT-10k	AO	78.2	306
Visual Object Tracking	TNL2K	AUC	61.7	169
Visual Object Tracking	OTB-100	AUC	72.4	154
Visual Object Tracking	VOT 2020 (test)	EAO	0.605	147
Visual Object Tracking	LaSOT_ext	AUC	53.9	140

Showing 10 of 51 rows
