
NeoVerse: Enhancing 4D World Model with in-the-wild Monocular Videos

About

In this paper, we propose NeoVerse, a versatile 4D world model that is capable of 4D reconstruction, novel-trajectory video generation, and rich downstream applications. We first identify a common limitation of scalability in current 4D world modeling methods, caused either by expensive and specialized multi-view 4D data or by cumbersome training pre-processing. In contrast, our NeoVerse is built upon a core philosophy that makes the full pipeline scalable to diverse in-the-wild monocular videos. Specifically, NeoVerse features pose-free feed-forward 4D reconstruction, online monocular degradation pattern simulation, and other well-aligned techniques. These designs empower NeoVerse with versatility and generalization to various domains. Meanwhile, NeoVerse achieves state-of-the-art performance in standard reconstruction and generation benchmarks. Our project page is available at https://neoverse-4d.github.io.

Yuxue Yang, Lue Fan, Ziqi Shi, Junran Peng, Feng Wang, Zhaoxiang Zhang • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Video Reconstruction | DAVIS | PSNR 25.26 | 29 |
| Novel View Synthesis | NVIDIA | PSNR 15.86 | 20 |
| Novel View Synthesis | ADT | PSNR 21.94 | 10 |
| Novel View Synthesis | TUM-D | PSNR 15.26 | 10 |
| Novel View Synthesis | ExoRecon (held-out frames) | PSNR (Held-out Frames) 20.03 | 9 |
| Dynamic Reconstruction | DyCheck | PSNR 11.56 | 8 |
| Dynamic Reconstruction | ADT | PSNR 32.56 | 7 |
| 4D Camera Control | PREBench Camera-only | Camera Rotation Error 1.4736 | 7 |
| Novel View Generation | VBench 100 unseen in-the-wild videos 30 | Inference Time (Generation) 18 | 6 |
| View Synthesis | N3DV Original Input Cameras | PSNR 24.5 | 6 |
Showing 10 of 22 rows
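
Most of the results above are reported as PSNR (peak signal-to-noise ratio). As a reading aid, the standard definition can be sketched as below. This is a minimal illustration, not the evaluation code behind these benchmarks: the function name `psnr`, the flat-list image representation, and the assumption that pixel values lie in [0, 1] are all choices made here, and each benchmark's own protocol (resolution, masking, frame selection) is not specified on this page.

```python
import math


def psnr(reference, estimate, max_value=1.0):
    """Return PSNR in dB between two images given as equal-length flat lists.

    Assumes pixel intensities lie in [0, max_value]; this range is an
    assumption of the sketch, not something stated by the benchmarks.
    """
    if len(reference) != len(estimate) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        # Identical images: PSNR is unbounded.
        return math.inf
    # PSNR = 10 * log10(MAX^2 / MSE).
    return 10.0 * math.log10(max_value ** 2 / mse)


# Two pixels, each off by 0.1: MSE is about 0.01, so PSNR is roughly 20 dB.
example = psnr([0.0, 1.0], [0.1, 0.9])
```

Higher PSNR means lower reconstruction error, and a 10 dB gain corresponds to a tenfold reduction in mean squared error. Values are only comparable within the same dataset and evaluation setup.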
