
Efficiently Reconstructing Dynamic Scenes One D4RT at a Time

About

Understanding and reconstructing the complex geometry and motion of dynamic scenes from video remains a formidable challenge in computer vision. This paper introduces D4RT, a simple yet powerful feedforward model designed to efficiently solve this task. D4RT utilizes a unified transformer architecture to jointly infer depth, spatio-temporal correspondence, and full camera parameters from a single video. Its core innovation is a novel querying mechanism that sidesteps the heavy computation of dense, per-frame decoding and the complexity of managing multiple, task-specific decoders. Our decoding interface allows the model to independently and flexibly probe the 3D position of any point in space and time. The result is a lightweight and highly scalable method that enables remarkably efficient training and inference. We demonstrate that our approach sets a new state of the art, outperforming previous methods across a wide spectrum of 4D reconstruction tasks. We refer to the project webpage for animated results: https://d4rt-paper.github.io/.
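To make the idea of an independent, point-wise decoding interface concrete, here is a toy sketch in plain Python. It is not the D4RT architecture: the abstract does not specify the decoder, so the sketch assumes, purely for illustration, a single cross-attention read-out from one query to a set of encoded video features. The names `encode_video`, `embed_query`, `decode_point`, and `decode_points` are hypothetical, and the random features stand in for a real encoder.

```python
import math
import random


def encode_video(num_tokens, dim, seed=0):
    """Stand-in for a video encoder (assumption): returns random feature vectors."""
    rng = random.Random(seed)
    return [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(num_tokens)]


def embed_query(u, v, t, dim):
    """Hypothetical sinusoidal embedding of a query at pixel (u, v) and time t."""
    coords = (u, v, t)
    return [math.sin((1.0 + i // 3) * coords[i % 3]) for i in range(dim)]


def decode_point(query, features):
    """Toy cross-attention read-out: one query attends to all video features."""
    dim = len(query)
    scores = [
        sum(q * f for q, f in zip(query, feat)) / math.sqrt(dim)
        for feat in features
    ]
    top = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    pooled = [
        sum(w * feat[d] for w, feat in zip(weights, features)) / total
        for d in range(dim)
    ]
    # Toy output head (assumption): read the first three channels as (x, y, z).
    return tuple(pooled[:3])


def decode_points(queries, features):
    """Each query is decoded on its own; queries never interact with each other."""
    return [decode_point(q, features) for q in queries]


# Encode once, then probe only the points of interest.
features = encode_video(num_tokens=16, dim=8)
queries = [embed_query(0.1, 0.2, 0.0, 8), embed_query(0.7, 0.4, 0.5, 8)]
points = decode_points(queries, features)
```

Because no query attends to any other query in this sketch, decoding a point alone gives the same result as decoding it within a batch. The cost therefore grows with the number of points requested rather than with the number of pixels per frame, which is the contrast with dense, per-frame decoding that the abstract draws.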

Chuhan Zhang, Guillaume Le Moing, Skanda Koppula, Ignacio Rocco, Liliane Momeni, Junyu Xie, Shuyang Sun, Rahul Sukthankar, Joëlle K. Barral, Raia Hadsell, Zoubin Ghahramani, Andrew Zisserman, Junlin Zhang, Mehdi S. M. Sajjadi • 2025

Related benchmarks

Task                           Dataset                       Metric    Result  Rank
Camera pose estimation         Sintel 14-sequence            ATE       0.065   15
Camera Coordinate 3D tracking  TAPVid-3D ADT (test)          AJ        0.307   9
Camera Coordinate 3D tracking  TAPVid-3D PStudio (test)      AJ        0.372   9
Camera Coordinate 3D tracking  TAPVid-3D DriveTrack (test)   AJ        0.257   9
World Coordinate 3D tracking   TAPVid-3D DriveTrack (test)   APD3D     0.47    7
World Coordinate 3D tracking   TAPVid-3D ADT (test)          APD3D     0.319   7
3D Point Cloud Reconstruction  MPI Sintel                    L1 Error  0.768   6
3D Point Cloud Reconstruction  ScanNet                       L1 Error  0.028   6
Camera pose estimation         ScanNet static indoor scenes  ATE       0.014   6
Camera pose estimation         Re10K static indoor scenes    Pose AUC  83.5    6
Showing 10 of 14 rows
