PointOdyssey: A Large-Scale Synthetic Dataset for Long-Term Point Tracking

About

We introduce PointOdyssey, a large-scale synthetic dataset, and data generation framework, for the training and evaluation of long-term fine-grained tracking algorithms. Our goal is to advance the state-of-the-art by placing emphasis on long videos with naturalistic motion. Toward the goal of naturalism, we animate deformable characters using real-world motion capture data, we build 3D scenes to match the motion capture environments, and we render camera viewpoints using trajectories mined via structure-from-motion on real videos. We create combinatorial diversity by randomizing character appearance, motion profiles, materials, lighting, 3D assets, and atmospheric effects. Our dataset currently includes 104 videos, averaging 2,000 frames long, with orders of magnitude more correspondence annotations than prior work. We show that existing methods can be trained from scratch in our dataset and outperform the published variants. Finally, we introduce modifications to the PIPs point tracking method, greatly widening its temporal receptive field, which improves its performance on PointOdyssey as well as on two real-world benchmarks. Our data and code are publicly available at: https://pointodyssey.com

Yang Zheng, Adam W. Harley, Bokui Shen, Gordon Wetzstein, Leonidas J. Guibas • 2023

Related benchmarks

Task	Dataset	Result
Point Tracking	TAP-Vid DAVIS (First)	Delta Avg (<c) 69.1	76
Point Tracking	TAP-Vid Kinetics (First)	Avg Displacement Error (delta_avg) 58.5	53
Point Tracking	DAVIS	--	38
Point Tracking	TAP-Vid DAVIS (Strided)	Avg Delta Error 73.7	33
Point Tracking	TAP-Vid-Kinetics (val)	Average Displacement Error 63.5	25
Point Tracking	RoboTAP	--	22
Video Tracking	BADJA	delta_seg 9.8	15
Point Tracking	TAP-Vid	DAVIS Score 62.5	15
Feature Tracking	EC	Feature Age (FA) 82.6	14
Point Tracking	RGB-Stacking	Average Delta 58.5	13
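
Several rows above report a delta_avg style score. As a minimal sketch, assuming these refer to the TAP-Vid position-accuracy metric (the percentage of visible points whose predicted location falls within 1, 2, 4, 8, and 16 pixels of ground truth, averaged over those five thresholds), the computation looks roughly like this. The function name and input layout are hypothetical and chosen for illustration; this is not the benchmark's evaluation code.

```python
import math

def delta_avg(pred, gt, visible, thresholds=(1, 2, 4, 8, 16)):
    """Average, over pixel thresholds, of the percentage of visible
    points whose prediction lies within that threshold of ground truth.

    pred, gt: sequences of (x, y) pixel coordinates (hypothetical layout).
    visible:  sequence of booleans; occluded points are skipped.
    """
    # Euclidean error for visible points only.
    errors = [math.dist(p, g) for p, g, v in zip(pred, gt, visible) if v]
    if not errors:
        raise ValueError("no visible points to evaluate")
    # Fraction of points strictly inside each threshold.
    per_threshold = [
        sum(e < t for e in errors) / len(errors) for t in thresholds
    ]
    return 100.0 * sum(per_threshold) / len(per_threshold)

# Three visible points with errors of 0, 3, and 10 pixels,
# plus one occluded point that is ignored.
pred = [(0, 0), (3, 0), (10, 0), (500, 500)]
gt = [(0, 0), (0, 0), (0, 0), (0, 0)]
visible = [True, True, True, False]
score = delta_avg(pred, gt, visible)
```

Under this definition higher is better, which is consistent with reading the values in the table as accuracy percentages rather than pixel errors.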

