
Phys4D: Fine-Grained Physics-Consistent 4D Modeling from Video Diffusion

About

Recent video diffusion models have achieved impressive capabilities as large-scale generative world models. However, these models often struggle with fine-grained physical consistency, exhibiting physically implausible dynamics over time. In this work, we present Phys4D, a pipeline for learning physics-consistent 4D world representations from video diffusion models. Phys4D adopts a three-stage training paradigm that progressively lifts appearance-driven video diffusion models into physics-consistent 4D world representations. We first bootstrap robust geometry and motion representations through large-scale pseudo-supervised pretraining, establishing a foundation for 4D scene modeling. We then perform physics-grounded supervised fine-tuning using simulation-generated data, enforcing temporally consistent 4D dynamics. Finally, we apply simulation-grounded reinforcement learning to correct residual physical violations that are difficult to capture through explicit supervision. To evaluate fine-grained physical consistency beyond appearance-based metrics, we introduce a set of 4D world consistency evaluations that probe geometric coherence, motion stability, and long-horizon physical plausibility. Experimental results demonstrate that Phys4D substantially improves fine-grained spatiotemporal and physical consistency compared to appearance-driven baselines, while maintaining strong generative performance. Our project page is available at https://sensational-brioche-7657e7.netlify.app/

Haoran Lu, Shang Wu, Jianshu Zhang, Maojiang Su, Guo Ye, Chenwei Xu, Lie Lu, Pranav Maneriker, Fan Du, Manling Li, Zhaoran Wang, Han Liu • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Video-Based Physics Evaluation | Physics-IQ (test) | Score | 30.2 | 9 |
| 4D Modeling Temporal and Motion Consistency | Video Consistency Evaluation Set | Depth Warp L1 Error | 0.5141 | 4 |
| 4D World Modeling | 4D Simulation Dataset | Chamfer Distance | 0.4626 | 4 |
| Per-frame 3D Geometry | Physics-IQ (test) | AbsRel | 27.11 | 4 |
| Video Quality | Physics-IQ (test) | FVD | 124.5 | 4 |
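
For readers unfamiliar with two of the metrics listed above, the following is a minimal sketch of how Chamfer Distance and AbsRel are commonly defined. It is not taken from the Phys4D paper: the function names `chamfer_distance` and `abs_rel` are hypothetical, and the exact variants used for the reported numbers (squared versus unsquared distances, sum versus mean, any scale alignment or percentage scaling) are assumptions that may differ.

```python
import math

def chamfer_distance(a, b):
    """Symmetric Chamfer distance between two point sets (lists of xyz tuples).

    Assumed variant: mean unsquared nearest-neighbour distance in each
    direction, summed. Other works use squared distances or averages.
    """
    def mean_nearest(src, dst):
        # For each source point, find the distance to its closest target point.
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)

    return mean_nearest(a, b) + mean_nearest(b, a)

def abs_rel(pred, gt):
    """Mean absolute relative depth error over pixels with positive ground truth."""
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    return sum(abs(p - g) / g for p, g in pairs) / len(pairs)

# Toy inputs, purely illustrative (not data from the paper).
pred_points = [(0.0, 0.0, 0.0)]
gt_points = [(1.0, 0.0, 0.0)]
print(chamfer_distance(pred_points, gt_points))
print(abs_rel([2.0, 2.0], [1.0, 4.0]))
```

Both metrics are lower-is-better under these definitions. The brute-force nearest-neighbour search is quadratic in the number of points, so a real evaluation would likely use a spatial index or a batched GPU implementation.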
