
GeoNVS: Geometry Grounded Video Diffusion for Novel View Synthesis

About

Novel view synthesis requires strong 3D geometric consistency and the ability to generate visually coherent images across diverse viewpoints. While recent camera-controlled video diffusion models show promising results, they often suffer from geometric distortions and limited camera controllability. To overcome these challenges, we introduce GeoNVS, a geometry-grounded novel-view synthesizer that enhances both geometric fidelity and camera controllability through explicit 3D geometric guidance. Our key innovation is the Gaussian Splat Feature Adapter (GS-Adapter), which lifts input-view diffusion features into 3D Gaussian representations, renders geometry-constrained novel-view features, and adaptively fuses them with diffusion features to correct geometrically inconsistent representations. Unlike prior methods that inject geometry at the input level, GS-Adapter operates in feature space, avoiding view-dependent color noise that degrades structural consistency. Its plug-and-play design enables zero-shot compatibility with diverse feed-forward geometry models without additional training, and can be adapted to other video diffusion backbones. Experiments across 9 scenes and 18 settings demonstrate state-of-the-art performance, achieving 11.3% and 14.9% improvements over SEVA and CameraCtrl, with up to 2x reduction in translation error and 7x in Chamfer Distance.
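The abstract reports geometric accuracy in terms of Chamfer Distance. As a rough illustration of what that metric measures, here is a minimal NumPy sketch of one common symmetric variant (mean squared nearest-neighbor distance in both directions). The function name `chamfer_distance` and the choice of squared distances with mean reduction are assumptions made for illustration; the listing does not state which variant or normalization the authors use.

```python
import numpy as np


def chamfer_distance(a, b):
    """Symmetric Chamfer distance between point sets a (N, 3) and b (M, 3).

    Assumed variant: squared Euclidean distances, averaged per direction,
    then summed. Other papers use unsquared distances or different scaling.
    """
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    # Pairwise squared distances, shape (N, M). Brute force, so only
    # suitable for small point sets.
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    # For each point in a, the distance to its nearest neighbor in b,
    # and vice versa.
    a_to_b = d2.min(axis=1).mean()
    b_to_a = d2.min(axis=0).mean()
    return a_to_b + b_to_a


if __name__ == "__main__":
    pts = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
    shifted = pts + np.array([0.0, 0.0, 1.0])
    print(chamfer_distance(pts, pts))
    print(chamfer_distance(pts, shifted))
```

Lower values mean the reconstructed geometry lies closer to the reference. For large point clouds, a spatial index such as a k-d tree would replace the dense pairwise matrix.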

Minjun Kang, Inkyu Shin, Taeyeop Lee, Myungchul Kim, In So Kweon, Kuk-Jin Yoon • 2026

Related benchmarks

Task | Dataset | Result | Rank
Novel View Synthesis | RE10K | SSIM 82.6 | 142
Novel View Synthesis | T&T small-viewpoint set (O) | PSNR 22.24 | 44
Novel View Synthesis | RE10K Small | PSNR 16.66 | 38
New View Synthesis | T&T | LPIPS 0.225 | 33
New View Synthesis | LLFF (R) | SSIM 0.895 | 32
Novel View Synthesis | DL3DV S | LPIPS 0.29 | 25
Long trajectory Novel View Synthesis | MIP360 | PSNR 16.45 | 24
Long trajectory Novel View Synthesis | DL3DV | PSNR 19.32 | 24
Long trajectory Novel View Synthesis | T&T | PSNR 24.52 | 24
Novel View Synthesis | DTU small-viewpoint set (R) | PSNR 18.29 | 24
Showing 10 of 53 rows
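
Several of the benchmark rows above report PSNR. For readers unfamiliar with the metric, the following is a minimal sketch of the standard definition, PSNR = 10 log10(MAX^2 / MSE), in decibels. The function name `psnr` and the default peak value of 1.0 (images scaled to [0, 1]) are assumptions for illustration; the listing does not specify the evaluation protocol behind these numbers.

```python
import numpy as np


def psnr(pred, target, max_val=1.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images.

    max_val is the assumed peak pixel value (1.0 for [0, 1] images,
    255 for 8-bit images).
    """
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    mse = np.mean((pred - target) ** 2)
    if mse == 0:
        # Identical images: PSNR is unbounded.
        return float("inf")
    return float(10.0 * np.log10(max_val ** 2 / mse))


if __name__ == "__main__":
    target = np.zeros((4, 4, 3))
    pred = np.full((4, 4, 3), 0.1)
    print(psnr(pred, target))
```

Higher PSNR indicates a rendering closer to the ground-truth view, whereas for LPIPS lower is better.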
