Shape of Motion: 4D Reconstruction from a Single Video
About
Monocular dynamic reconstruction is a challenging and long-standing vision problem due to the highly ill-posed nature of the task. Existing approaches depend on templates, are effective only in quasi-static scenes, or fail to model 3D motion explicitly. We introduce a method for reconstructing generic dynamic scenes, featuring explicit, persistent 3D motion trajectories in the world coordinate frame, from casually captured monocular videos. We tackle the problem with two key insights: First, we exploit the low-dimensional structure of 3D motion by representing scene motion with a compact set of SE(3) motion bases. Each point's motion is expressed as a linear combination of these bases, facilitating soft decomposition of the scene into multiple rigidly-moving groups. Second, we take advantage of off-the-shelf data-driven priors such as monocular depth maps and long-range 2D tracks, and devise a method to effectively consolidate these noisy supervisory signals, resulting in a globally consistent representation of the dynamic scene. Experiments show that our method achieves state-of-the-art performance for both long-range 3D/2D motion estimation and novel view synthesis on dynamic scenes. Project Page: https://shape-of-motion.github.io/
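To make the motion-basis idea concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes each point carries a vector of per-basis logits that a softmax turns into blending weights, and that a point's new position is the weighted sum of its positions under each basis transform (as in linear blend skinning). The abstract does not specify the blending rule, so that choice, along with the names `blend_motion`, `make_se3`, and `rotation_z`, is hypothetical.

```python
import numpy as np

def rotation_z(theta):
    """3x3 rotation about the z-axis by angle theta (radians)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def make_se3(R, t):
    """Pack a rotation R (3x3) and translation t (3,) into a 4x4 rigid transform."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

def blend_motion(points, logits, bases):
    """Move points with a soft mixture of rigid motion bases.

    points: (N, 3) positions in a reference frame
    logits: (N, B) per-point, per-basis scores (assumed parameterization)
    bases:  (B, 4, 4) rigid transforms for one target time step
    Returns the moved points (N, 3) and the softmax weights (N, B).
    """
    # Numerically stable softmax over the basis axis.
    z = logits - logits.max(axis=1, keepdims=True)
    weights = np.exp(z)
    weights /= weights.sum(axis=1, keepdims=True)

    # Apply every basis to every point in homogeneous coordinates.
    homo = np.concatenate([points, np.ones((len(points), 1))], axis=1)  # (N, 4)
    moved = np.einsum("bij,nj->bni", bases, homo)[..., :3]              # (B, N, 3)

    # Assumption: blend the transformed positions, not the transforms themselves.
    blended = np.einsum("nb,bnk->nk", weights, moved)                   # (N, 3)
    return blended, weights

# Two bases: a static background and a part that rotates and translates.
bases = np.stack([
    make_se3(np.eye(3), np.zeros(3)),
    make_se3(rotation_z(np.pi / 2), np.array([1.0, 0.0, 0.0])),
])
points = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
logits = np.array([[20.0, -20.0], [-20.0, 20.0]])  # near one-hot assignments

moved, weights = blend_motion(points, logits, bases)
groups = weights.argmax(axis=1)  # hard labels read off the soft decomposition
```

Because the weights are soft, a point near a boundary between two moving parts can follow a mixture of both motions, while taking the `argmax` of the weights recovers an approximate segmentation into rigidly moving groups.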
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Novel View Synthesis | iPhone DyCheck 7 scenes 2x resolution | mPSNR | 17.32 | 31 |
| 3D human reconstruction | ZJU-MoCap (test) | PSNR | 26.87 | 31 |
| Novel View Synthesis | iPhone dataset | SSIM | 0.65 | 23 |
| 4D Reconstruction | DyCheck (test) | mPSNR | 17.32 | 21 |
| Novel View Synthesis | DyCheck (test) | mPSNR | 17.96 | 15 |
| Novel View Synthesis | Nvidia Dataset | PSNR | 23.29 | 15 |
| Dynamic View Synthesis | DyCheck 5 scenes, 1x resolution 1.0 (test) | mLPIPS | 0.45 | 11 |
| 3D Point Tracking | iPhone dataset | EPE | 0.082 | 10 |
| 2D Point Tracking | iPhone dataset | AJ | 34.4 | 10 |
| Novel View Synthesis | NVIDIA dataset (test) | Mean PSNR | 23.37 | 9 |