G3Splat: Geometrically Consistent Generalizable Gaussian Splatting

About

3D Gaussians have become a powerful scene representation for real-time splatting and high-quality novel-view synthesis. This has motivated generalizable splatting -- methods that adapt feed-forward geometry prediction networks to produce per-pixel Gaussians from a set of images. However, most generalizable splatting pipelines are supervised primarily through a view-synthesis loss to predict Gaussian orientation, anisotropic scale, opacity, and appearance in addition to their locations. We show that this learning objective is under-constrained. Models trained with view synthesis alone produce splats whose orientations and scales have no geometric connotation. The result is that, while producing decent view-synthesis performance, nearly all generalizable splatting methods produce geometrically inaccurate and misaligned Gaussians. We introduce G3Splat, a geometry-consistent generalizable splatting framework that addresses these degeneracies through differentiable geometric priors on the predicted 3D Gaussians, making the learning problem well-posed. These priors encourage the per-pixel splats to remain on their viewing rays and to orient themselves in accordance with local surfaces. Our priors are architecture-agnostic and can be incorporated into any previously studied geometric backbone for generalizable splatting, as well as different scene representations. We test G3Splat with both DUSt3R-style and VGGT-style backbones to predict pixel-aligned full-rank 3DGS as well as surfel-like 2DGS. Trained on RE10K, G3Splat produces Gaussian splats with significantly higher geometric fidelity than baselines, providing state-of-the-art novel-view depth, mesh reconstruction, and relative pose estimation performance while preserving novel-view synthesis quality, as evaluated on datasets such as ACID and ScanNet. Code and pretrained models are released on our project page.

Mehdi Hosseinzadeh, Shin-Fang Chng, Yi Xu, Simon Lucey, Ian Reid, Ravi Garg • 2025
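The abstract names two geometric priors: per-pixel splats should stay on their viewing rays, and they should orient themselves with the local surface. The sketch below is one possible way to write such penalties, not the authors' formulation. The function names, the choice of perpendicular distance for the ray term, and the use of the thinnest Gaussian axis as a stand-in for the splat normal are all assumptions made for illustration. It uses NumPy for readability; a training version would need an autodiff framework.

```python
import numpy as np


def ray_alignment_penalty(centers, origin, ray_dirs):
    """Hypothetical ray prior: mean squared perpendicular distance
    between each Gaussian center and its pixel's viewing ray.

    centers:  (N, 3) Gaussian centers
    origin:   (3,)   camera center
    ray_dirs: (N, 3) ray directions (need not be unit length)
    """
    d = ray_dirs / np.linalg.norm(ray_dirs, axis=-1, keepdims=True)
    v = centers - origin
    # Component of v along the ray, then the residual off the ray.
    along = np.sum(v * d, axis=-1, keepdims=True) * d
    perp = v - along
    return float(np.mean(np.sum(perp ** 2, axis=-1)))


def orientation_penalty(rotations, scales, normals):
    """Hypothetical orientation prior: 1 - |cos| of the angle between
    each Gaussian's thinnest principal axis and a reference normal.

    rotations: (N, 3, 3) rotation matrices, columns are principal axes
    scales:    (N, 3)    per-axis scales
    normals:   (N, 3)    reference surface normals (assumed given)
    """
    n_gauss = rotations.shape[0]
    thinnest = np.argmin(scales, axis=-1)                  # (N,)
    axes = rotations[np.arange(n_gauss), :, thinnest]      # (N, 3)
    n = normals / np.linalg.norm(normals, axis=-1, keepdims=True)
    cos = np.abs(np.sum(axes * n, axis=-1))
    return float(np.mean(1.0 - cos))
```

Both terms are zero when the prior is satisfied exactly: a center lying on its ray contributes no perpendicular residual, and a Gaussian whose thinnest axis is parallel (or antiparallel) to the reference normal has a cosine magnitude of one.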

Related benchmarks

Task	Dataset	Metric	Value	
Monocular Depth Estimation	NYU V2	Delta 1 Acc	0.434	192
Novel View Synthesis	ACID (test)	PSNR	23.827	113
Novel View Synthesis	RE10K (Medium)	PSNR	23.426	57
Novel View Synthesis	RE10K (Average)	PSNR	23.504	57
Pose Estimation	RE10K	AUC @ 5°	0.684	41
Pose Estimation	ScanNet	AUC @ 5°	14.8	41
Novel View Synthesis	RE10K Small	PSNR	21.377	38
Novel View Synthesis	ScanNet (test)	PSNR	21.168	34
Novel View Synthesis	RE10K Large	PSNR	25.459	25
Pose Estimation	ACID	AUC @ 5°	46.6	23
