Real-World Point Tracking with Verifier-Guided Pseudo-Labeling

About

Models for long-term point tracking are typically trained on large synthetic datasets. Their performance degrades on real-world videos, which differ in characteristics from synthetic data and lack dense ground-truth annotations. Self-training on unlabeled videos has been explored as a practical solution, but the quality of pseudo-labels depends strongly on the reliability of the teacher models, which varies across frames and scenes. In this paper, we address the problem of real-world fine-tuning and introduce a verifier, a meta-model that learns to assess the reliability of tracker predictions and to guide pseudo-label generation. Given candidate trajectories from multiple pretrained trackers, the verifier evaluates them per frame and selects the most trustworthy predictions, resulting in high-quality pseudo-label trajectories. When applied for fine-tuning, verifier-guided pseudo-labeling substantially improves the quality of supervision and enables data-efficient adaptation to unlabeled videos. Extensive experiments on four real-world benchmarks demonstrate that our approach achieves state-of-the-art results while requiring less data than prior self-training methods. Project page: https://kuis-ai.github.io/track_on_r
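
The per-frame selection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `select_pseudo_labels`, the data layout, and the per-frame reliability scores are all assumptions made for illustration. In the paper the scores would come from the learned verifier; here they are simply passed in as numbers.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def select_pseudo_labels(
    candidates: Sequence[Sequence[Point]],
    scores: Sequence[Sequence[float]],
) -> Tuple[List[Point], List[int]]:
    """Pick, for every frame, the candidate point with the highest score.

    candidates[m][t] is the (x, y) prediction of tracker m at frame t.
    scores[m][t] is a hypothetical reliability score for that prediction
    (an assumption: a stand-in for the output of a learned verifier).
    Returns the stitched trajectory and the index of the chosen tracker
    at each frame.
    """
    if not candidates or len(candidates) != len(scores):
        raise ValueError("need one score row per candidate trajectory")
    num_frames = len(candidates[0])
    for traj, row in zip(candidates, scores):
        if len(traj) != num_frames or len(row) != num_frames:
            raise ValueError("all trajectories and score rows must share a length")

    trajectory: List[Point] = []
    chosen: List[int] = []
    for t in range(num_frames):
        # Index of the tracker judged most reliable at frame t
        # (ties resolve to the first tracker).
        best = max(range(len(candidates)), key=lambda m: scores[m][t])
        trajectory.append(candidates[best][t])
        chosen.append(best)
    return trajectory, chosen


# Toy example: tracker A drifts at the last frame, tracker B does not.
tracker_a = [(10.0, 10.0), (11.0, 10.5), (30.0, 40.0)]
tracker_b = [(10.2, 9.9), (11.4, 10.9), (12.1, 11.2)]
scores = [[0.9, 0.8, 0.1], [0.7, 0.6, 0.95]]

pseudo, chosen = select_pseudo_labels([tracker_a, tracker_b], scores)
print(chosen)  # [0, 0, 1]
print(pseudo)  # [(10.0, 10.0), (11.0, 10.5), (12.1, 11.2)]
```

In this toy case the stitched trajectory follows tracker A for the first two frames and switches to tracker B once A's score drops, which is the behavior the abstract attributes to per-frame selection.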

Görkay Aydemir, Fatma Güney, Weidi Xie • 2026

Related benchmarks

Task	Dataset	Metric	Result	
Point Tracking	DAVIS TAP-Vid	Average Jaccard (AJ)	68.1	52
Point Tracking	TAP-Vid Kinetics	Overall Accuracy	90.5	48
Point Tracking	RoboTAP	AJ	70.9	22
Point Tracking	EgoPoints	Average Displacement X	67.3	10
Point Tracking	Dynamic Replica	Average Displacement Error	75.1	9
Point Tracking	PointOdyssey	Average Displacement Error (ADE)	53.4	4

