
Gaussian Variation Field Diffusion for High-fidelity Video-to-4D Synthesis

About

In this paper, we present a novel framework for video-to-4D generation that creates high-quality dynamic 3D content from single video inputs. Direct 4D diffusion modeling is extremely challenging due to costly data construction and the high-dimensional nature of jointly representing 3D shape, appearance, and motion. We address these challenges by introducing a Direct 4DMesh-to-GS Variation Field VAE that directly encodes canonical Gaussian Splats (GS) and their temporal variations from 3D animation data without per-instance fitting, and compresses high-dimensional animations into a compact latent space. Building upon this efficient representation, we train a Gaussian Variation Field diffusion model with a temporal-aware Diffusion Transformer conditioned on input videos and canonical GS. Trained on carefully curated animatable 3D objects from the Objaverse dataset, our model demonstrates superior generation quality compared to existing methods. It also exhibits remarkable generalization to in-the-wild video inputs despite being trained exclusively on synthetic data, paving the way for generating high-quality animated 3D content. Project page: https://gvfdiffusion.github.io/.

Bowen Zhang, Sicheng Xu, Chuxin Wang, Jiaolong Yang, Feng Zhao, Dong Chen, Baining Guo • 2025

Related benchmarks

Task | Dataset | Result | Rank
4D Mesh Reconstruction | Objaverse (test) | CD 0.1157 | 13
4D Synthesis | Monocular Video | FPS 0.8 | 8
4D mesh generation | Truebones Zoo (test) | CD 0.1406 | 6
3D Reconstruction | Objaverse Diffusion4D curated 1.0 (test) | P2S 0.0345 | 5
4D Motion Modeling | Motion-80 Short Sequence | CD 0.197 | 5
video-to-4D object generation | video-to-4D object generation (test) | CLIP Score 0.931 | 5
4D Object Reconstruction | DeformingThings (test) | CD 0.2806 | 5
Novel View Synthesis | Objaverse | PSNR 17.31 | 5
3D Motion Generation | 20 static meshes (test) | OC 0.167 | 4
Text-to-motion generation | BIMO | Text-to-Motion Agreement (TA) 2.343 | 4
Showing 10 of 12 rows
