CoTracker3: Simpler and Better Point Tracking by Pseudo-Labelling Real Videos

About

Most state-of-the-art point trackers are trained on synthetic data due to the difficulty of annotating real videos for this task. However, this can result in suboptimal performance due to the statistical gap between synthetic and real videos. In order to understand these issues better, we introduce CoTracker3, comprising a new tracking model and a new semi-supervised training recipe. This allows real videos without annotations to be used during training by generating pseudo-labels using off-the-shelf teachers. The new model eliminates or simplifies components from previous trackers, resulting in a simpler and often smaller architecture. This training scheme is much simpler than prior work and achieves better results using 1,000 times less data. We further study the scaling behaviour to understand the impact of using more real unsupervised data in point tracking. The model is available in online and offline variants and reliably tracks visible and occluded points.
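
The pseudo-labelling step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the two teachers are hypothetical toy functions standing in for off-the-shelf trackers, and uniform query sampling on the first frame and picking one teacher at random per video are assumptions made only for this illustration.

```python
import random
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]
Track = List[Point]          # one (x, y) position per frame
Video = List[object]         # frames are opaque placeholders in this sketch
Teacher = Callable[[Video, Sequence[Point]], List[Track]]


def static_teacher(video: Video, queries: Sequence[Point]) -> List[Track]:
    """Hypothetical teacher: predicts that every point stays where it started."""
    return [[q for _ in video] for q in queries]


def drift_teacher(video: Video, queries: Sequence[Point]) -> List[Track]:
    """Hypothetical teacher: predicts a one-pixel-per-frame drift to the right."""
    return [[(x + t, y) for t in range(len(video))] for (x, y) in queries]


def make_pseudo_labels(
    videos: Sequence[Video],
    teachers: Sequence[Teacher],
    num_queries: int,
    width: float,
    height: float,
    seed: int = 0,
) -> List[Tuple[Video, List[Point], List[Track]]]:
    """Label unannotated videos with tracks predicted by existing trackers."""
    rng = random.Random(seed)
    dataset = []
    for video in videos:
        # Assumption: query points are sampled uniformly on the first frame.
        queries = [
            (rng.uniform(0, width), rng.uniform(0, height))
            for _ in range(num_queries)
        ]
        # Assumption: one teacher is drawn at random for each video.
        teacher = rng.choice(teachers)
        # The teacher's predictions serve as training targets for a student.
        tracks = teacher(video, queries)
        dataset.append((video, queries, tracks))
    return dataset


videos = [[None] * 4, [None] * 6]  # two unlabelled clips of 4 and 6 frames
dataset = make_pseudo_labels(
    videos, [static_teacher, drift_teacher], num_queries=3, width=64, height=48
)
```

A student tracker would then be trained on `dataset` as if the teacher tracks were ground truth. No manual annotation of the real videos is involved at any point.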

Nikita Karaev, Iurii Makarov, Jianyuan Wang, Natalia Neverova, Andrea Vedaldi, Christian Rupprecht • 2024

Related benchmarks

Task               Dataset                          Metric                      Result   Rank
Point Tracking     DAVIS TAP-Vid                    Average Jaccard (AJ)        64.8     41
Point Tracking     DAVIS                            AJ                          63.8     38
Point Tracking     TAP-Vid Kinetics                 Overall Accuracy            89.43    37
Point Tracking     TAP-Vid-Kinetics (val)           Average Displacement Error  67.8     25
Point Tracking     Kinetics                         delta_avg                   68.5     24
3D Point Tracking  TAPVid-3D DriveTrack (minival)   3D AJ Score                 13.6     19
3D Point Tracking  TAPVid-3D Average (minival)      3D AJ                       0.135    19
3D Point Tracking  TAPVid-3D Aria (minival)         3D-AJ                       16.8     19
3D Point Tracking  TAPVid-3D PStudio (minival)      3D-AJ                       10.1     19
3D Point Tracking  TAPVid-3D ADT 1.0 (test)         APD3D                       12.3     15
Showing 10 of 27 rows
