V2M4: 4D Mesh Animation Reconstruction from a Single Monocular Video

About

We present V2M4, a novel 4D reconstruction method that directly generates a usable 4D mesh animation asset from a single monocular video. Unlike existing approaches that rely on priors from multi-view image and video generation models, our method is based on native 3D mesh generation models. Naively applying 3D mesh generation models to generate a mesh for each frame in a 4D task can lead to issues such as incorrect mesh poses, misalignment of mesh appearance, and inconsistencies in mesh geometry and texture maps. To address these problems, we propose a structured workflow that includes camera search and mesh reposing, condition embedding optimization for mesh appearance refinement, pairwise mesh registration for topology consistency, and global texture map optimization for texture consistency. Our method outputs high-quality 4D animated assets that are compatible with mainstream graphics and game software. Experimental results across a variety of animation types and motion amplitudes demonstrate the generalization and effectiveness of our method. Project page: https://windvchen.github.io/V2M4/.

Jianqi Chen, Biao Zhang, Xiangjun Tang, Peter Wonka • 2025

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| 4D Mesh Reconstruction | Objaverse (test) | CD | 0.1595 | 13 |
| 4D Synthesis | Monocular Video | FPS | 0.1 | 8 |
| 4D Object Reconstruction | DeformingThings (test) | CD | 0.1678 | 5 |
| 4D Motion Modeling | Motion-80 Short Sequence | CD | 0.3437 | 5 |
| 3D Motion Generation | 20 static meshes (test) | OC | 0.175 | 4 |
| Text-to-motion generation | BIMO | Text-to-Motion Agreement (TA) | 2.876 | 4 |
| Video-to-4D | Objaverse | CD (3D) | 0.063 | 4 |
| Novel View Synthesis | Consist4D (test) | LPIPS | 0.1611 | 4 |
| 4D Motion Modeling | Motion-80 Long Sequence | CD | 0.3719 | 4 |
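
Several rows above report a CD score. This listing does not define the abbreviation; in mesh and point-cloud reconstruction work it conventionally stands for Chamfer distance, and the sketch below assumes that reading. It shows one common variant: the mean squared nearest-neighbor distance, summed over both directions. Benchmarks differ on squared versus unsquared distances, on normalization, and on how points are sampled from the meshes, so this illustrates the idea and is not the evaluation code behind the numbers above. The function names are illustrative only.

```python
from typing import Sequence

Point = Sequence[float]


def nearest_sq_dist(p: Point, cloud: Sequence[Point]) -> float:
    """Squared Euclidean distance from p to its nearest neighbor in cloud."""
    return min(sum((a - b) ** 2 for a, b in zip(p, q)) for q in cloud)


def chamfer_distance(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Symmetric Chamfer distance between two point sets.

    Assumed variant: mean squared nearest-neighbor distance from a to b,
    plus the same quantity from b to a. Brute force, O(len(a) * len(b)).
    """
    if not a or not b:
        raise ValueError("both point sets must be non-empty")
    a_to_b = sum(nearest_sq_dist(p, b) for p in a) / len(a)
    b_to_a = sum(nearest_sq_dist(q, a) for q in b) / len(b)
    return a_to_b + b_to_a


if __name__ == "__main__":
    predicted = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    reference = [(0.0, 0.0, 0.0)]
    print(chamfer_distance(predicted, reference))
```

For a mesh sequence, a score of this kind would typically be computed per frame on points sampled from the predicted and reference surfaces and then averaged over frames; lower is better.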