AirSplat: Alignment and Rating for Robust Feed-Forward 3D Gaussian Splatting
About
While 3D Vision Foundation Models (3DVFMs) have demonstrated remarkable zero-shot capabilities in visual geometry estimation, their direct application to generalizable novel view synthesis (NVS) remains challenging. In this paper, we propose AirSplat, a novel training framework that effectively adapts the robust geometric priors of 3DVFMs for high-fidelity, pose-free NVS. Our approach introduces two key technical contributions: (1) Self-Consistent Pose Alignment (SCPA), a training-time feedback loop that ensures pixel-aligned supervision to resolve the pose-geometry discrepancy; and (2) Rating-based Opacity Matching (ROM), which leverages the local 3D geometry consistency knowledge of a sparse-view NVS teacher model to filter out degraded primitives. Experimental results on large-scale benchmarks demonstrate that our method significantly outperforms state-of-the-art pose-free NVS approaches in reconstruction quality. AirSplat highlights the potential of adapting 3DVFMs to enable simultaneous visual geometry estimation and high-quality view synthesis.
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Novel View Synthesis | RE10K | SSIM | 81.5 | 142 |
| Novel View Synthesis | ACID 20 (test) | PSNR | 26.21 | 24 |
| Novel View Synthesis | DL3DV 12 views | PSNR | 22.5 | 20 |
| Novel View Synthesis | ACID 16-view (test) | PSNR | 25.96 | 19 |
| Novel View Synthesis | DL3DV 24 views | PSNR | 22.22 | 19 |
| Novel View Synthesis | ACID 24 Views | PSNR | 26.42 | 10 |
| Novel View Synthesis | DL3DV 36 views | PSNR | 22.07 | 7 |
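
Most rows in the table report PSNR, which is conventionally given in decibels. As a reminder of what that metric measures, the following is a minimal sketch of the standard PSNR definition, assuming pixel intensities normalized to [0, 1] and images flattened to plain lists. The function name `psnr` and its parameters are illustrative and are not taken from the AirSplat code; the evaluation protocol behind the numbers above may differ in details such as color space or cropping.

```python
import math


def psnr(reference, rendered, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two flattened images.

    `reference` and `rendered` are equally sized sequences of pixel
    intensities; `max_value` is the largest possible intensity
    (assumed to be 1.0 here, i.e. images normalized to [0, 1]).
    """
    if len(reference) != len(rendered):
        raise ValueError("images must have the same number of pixels")
    if len(reference) == 0:
        raise ValueError("images must not be empty")

    # Mean squared error over all pixels.
    mse = sum((r - t) ** 2 for r, t in zip(reference, rendered)) / len(reference)

    # Identical images have zero error, so PSNR is unbounded.
    if mse == 0:
        return math.inf

    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    ground_truth = [0.0, 0.5, 1.0, 0.25]
    prediction = [0.1, 0.4, 0.9, 0.35]
    print(f"PSNR: {psnr(ground_truth, prediction):.2f} dB")
```

Higher PSNR means the rendered view is closer to the ground-truth image, so on a logarithmic scale a gain of a few tenths of a dB already reflects a measurable reduction in mean squared error.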