
Particle Video Revisited: Tracking Through Occlusions Using Point Trajectories

About

Tracking pixels in videos is typically studied as an optical flow estimation problem, where every pixel is described with a displacement vector that locates it in the next frame. Even though wider temporal context is freely available, prior efforts to take this into account have yielded only small gains over 2-frame methods. In this paper, we revisit Sand and Teller's "particle video" approach, and study pixel tracking as a long-range motion estimation problem, where every pixel is described with a trajectory that locates it in multiple future frames. We re-build this classic approach using components that drive the current state-of-the-art in flow and object tracking, such as dense cost maps, iterative optimization, and learned appearance updates. We train our models using long-range amodal point trajectories mined from existing optical flow data that we synthetically augment with multi-frame occlusions. We test our approach in trajectory estimation benchmarks and in keypoint label propagation tasks, and compare favorably against state-of-the-art optical flow and feature tracking methods.
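The abstract describes each pixel with a multi-frame trajectory that is iteratively refined using dense cost maps. As a rough illustration of that idea (not the paper's actual architecture — names, the local search window, and the argmax update are all simplifying assumptions), one refinement step might correlate the query point's feature against each frame's feature map and snap every trajectory point toward the best local match:

```python
import numpy as np

def correlation_cost(feat_patch, query_feat):
    # Dense cost map: dot product between the query's feature vector
    # and every spatial location's feature. feat_patch: (h, w, C).
    return feat_patch @ query_feat  # (h, w)

def refine_trajectory(feat_maps, traj, radius=3):
    # One toy refinement step (hypothetical; the paper learns this update):
    # for each frame, move the point to the best-matching location
    # inside a small search window around its current estimate.
    # feat_maps: list of T arrays (H, W, C); traj: (T, 2) int array of (x, y).
    T = len(feat_maps)
    query = feat_maps[0][traj[0, 1], traj[0, 0]]  # feature at the start point
    new_traj = traj.copy()
    for t in range(1, T):
        H, W, _ = feat_maps[t].shape
        x, y = traj[t]
        x0, x1 = max(0, x - radius), min(W, x + radius + 1)
        y0, y1 = max(0, y - radius), min(H, y + radius + 1)
        cost = correlation_cost(feat_maps[t][y0:y1, x0:x1], query)
        dy, dx = np.unravel_index(np.argmax(cost), cost.shape)
        new_traj[t] = (x0 + dx, y0 + dy)
    return new_traj
```

In the paper this hand-coded argmax is replaced by learned, iterative updates over the cost maps, which is what lets the trajectory survive multi-frame occlusions rather than drifting at the first ambiguous match.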

Adam W. Harley, Zhaoyuan Fang, Katerina Fragkiadaki • 2022

Related benchmarks

Task            Dataset                        Metric                 Result  Rank
Point Tracking  DAVIS TAP-Vid                  Average Jaccard (AJ)   42      41
Point Tracking  DAVIS                          AJ                     42      38
Point Tracking  TAP-Vid Kinetics               Overall Accuracy       77.1    37
Point Tracking  TAP-Vid RGB-Stacking (test)    AJ                     15.7    32
Point Tracking  TAP-Vid DAVIS (test)           AJ                     42      31
Point Tracking  TAP-Vid Kinetics (test)        Average Jaccard (AJ)   35.3    30
Point Tracking  Kinetics                       delta_avg              54.8    24
Point Tracking  TAP-Vid DAVIS (First)          Delta Avg (<c)         64.8    19
Point Tracking  DAVIS TAP-Vid (val)            AJ                     42      19
Point Tracking  Kubric                         AJ                     59.1    18

(Showing 10 of 27 rows)
