DINO_4D: Semantic-Aware 4D Reconstruction

About

At the intersection of computer vision and robotic perception, 4D reconstruction of dynamic scenes serves as the critical bridge connecting low-level geometric sensing with high-level semantic understanding. We present DINO_4D, which introduces frozen DINOv3 features as structural priors, injecting semantic awareness into the reconstruction process to suppress semantic drift during dynamic tracking. Experiments on the Point Odyssey and TUM-Dynamics benchmarks demonstrate that our method maintains the linear time complexity $O(T)$ of its predecessors while significantly improving tracking accuracy (APD) and reconstruction completeness. DINO_4D establishes a new paradigm for constructing 4D world models that possess both geometric precision and semantic understanding.

Yiru Yang, Zhuojie Wu, Quentin Marguet, Nishant Kumar Singh, Max Schulthess • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| World Coordinate 3D Reconstruction | TUM dynamics | -- | 9 |
| Reconstruction Error | TUM dynamics | Chamfer Distance (cm): 5.11 | 4 |
| 3D Tracking | Point Odyssey (test) | APD@0.1m: 41.8 | 3 |
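
For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how such numbers are commonly computed. It is not the authors' evaluation code. The function names `chamfer_distance` and `percent_within` are hypothetical, the symmetric mean-of-means Chamfer convention is an assumption (other conventions use squared distances or sums), and reading APD@0.1m as "percentage of tracked points within 0.1 m of ground truth" is also an assumption, since this page does not define the metric.

```python
import math
from typing import Sequence

Point = Sequence[float]


def _nearest_distance(p: Point, cloud: Sequence[Point]) -> float:
    """Euclidean distance from p to its nearest neighbour in cloud (brute force)."""
    return min(math.dist(p, q) for q in cloud)


def chamfer_distance(pred: Sequence[Point], gt: Sequence[Point]) -> float:
    """Symmetric Chamfer distance (assumed convention):
    average of the two directed mean nearest-neighbour distances."""
    if not pred or not gt:
        raise ValueError("both point clouds must be non-empty")
    pred_to_gt = sum(_nearest_distance(p, gt) for p in pred) / len(pred)
    gt_to_pred = sum(_nearest_distance(q, pred) for q in gt) / len(gt)
    return 0.5 * (pred_to_gt + gt_to_pred)


def percent_within(pred: Sequence[Point], gt: Sequence[Point],
                   threshold: float = 0.1) -> float:
    """Percentage of corresponding points whose Euclidean error is
    strictly below threshold (assumed reading of a metric like APD@0.1m)."""
    if len(pred) != len(gt) or not pred:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for p, q in zip(pred, gt) if math.dist(p, q) < threshold)
    return 100.0 * hits / len(pred)


# Toy example with coordinates in metres.
pred_points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
gt_points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.5)]

cd_m = chamfer_distance(pred_points, gt_points)
acc = percent_within(pred_points, gt_points, threshold=0.1)
print(f"Chamfer distance: {cd_m * 100:.1f} cm, within 0.1 m: {acc:.1f}%")
```

With inputs in metres, the Chamfer value is multiplied by 100 to match the centimetre unit used in the table. The brute-force nearest-neighbour search is quadratic in the number of points; a spatial index would be the usual choice for real point clouds.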
