Gaussian Variation Field Diffusion for High-fidelity Video-to-4D Synthesis

About

In this paper, we present a novel framework for video-to-4D generation that creates high-quality dynamic 3D content from single video inputs. Direct 4D diffusion modeling is extremely challenging due to costly data construction and the high-dimensional nature of jointly representing 3D shape, appearance, and motion. We address these challenges by introducing a Direct 4DMesh-to-GS Variation Field VAE that directly encodes canonical Gaussian Splats (GS) and their temporal variations from 3D animation data without per-instance fitting, and compresses high-dimensional animations into a compact latent space. Building upon this efficient representation, we train a Gaussian Variation Field diffusion model with temporal-aware Diffusion Transformer conditioned on input videos and canonical GS. Trained on carefully-curated animatable 3D objects from the Objaverse dataset, our model demonstrates superior generation quality compared to existing methods. It also exhibits remarkable generalization to in-the-wild video inputs despite being trained exclusively on synthetic data, paving the way for generating high-quality animated 3D content. Project page: https://gvfdiffusion.github.io/.

Bowen Zhang, Sicheng Xu, Chuxin Wang, Jiaolong Yang, Feng Zhao, Dong Chen, Baining Guo • 2025

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| 4D Mesh Reconstruction | Objaverse (test) | CD 0.1157 | 13 |
| Novel View Synthesis | Objaverse | PSNR 17.31 | 12 |
| 4D Synthesis | Monocular Video | FPS 0.8 | 8 |
| Novel View and Pose Synthesis | AiM Horse subset (test) | PSNR 12.68 | 6 |
| 4D mesh generation | Truebones Zoo (test) | CD 0.1406 | 6 |
| Novel View and Pose Synthesis | AiM Zebra (test) | PSNR 12.26 | 6 |
| 3D Reconstruction | Objaverse Diffusion4D curated 1.0 (test) | P2S 0.0345 | 5 |
| 4D Motion Modeling | Motion-80 Short Sequence | CD 0.197 | 5 |
| video-to-4D object generation | video-to-4D object generation (test) | CLIP Score 0.931 | 5 |
| 4D Object Reconstruction | DeformingThings (test) | CD 0.2806 | 5 |
Showing 10 of 14 rows
