pixelSplat: 3D Gaussian Splats from Image Pairs for Scalable Generalizable 3D Reconstruction
About
We introduce pixelSplat, a feed-forward model that learns to reconstruct 3D radiance fields parameterized by 3D Gaussian primitives from pairs of images. Our model features real-time and memory-efficient rendering for scalable training as well as fast 3D reconstruction at inference time. To overcome local minima inherent to sparse and locally supported representations, we predict a dense probability distribution over 3D and sample Gaussian means from that probability distribution. We make this sampling operation differentiable via a reparameterization trick, allowing us to back-propagate gradients through the Gaussian splatting representation. We benchmark our method on wide-baseline novel view synthesis on the real-world RealEstate10k and ACID datasets, where we outperform state-of-the-art light field transformers and accelerate rendering by 2.5 orders of magnitude while reconstructing an interpretable and editable 3D radiance field.
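The sampling step can be illustrated with a minimal sketch. The abstract does not say how the distribution is discretized or which reparameterization is used, so the details here are assumptions. The sketch uses a categorical distribution over depth buckets along one camera ray, linear depth spacing, and an opacity set to the probability of the sampled bucket. Under that last assumption, a loss on the rendered image could reach the predicted probabilities through the opacity even though the bucket index itself is discrete. The names `sample_gaussian_mean`, `near`, and `far` are hypothetical, and plain Python stands in for an autodiff framework.

```python
import math
import random


def softmax(logits):
    """Turn raw per-bucket scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_gaussian_mean(origin, direction, logits, near, far, rng):
    """Sample one Gaussian mean along a ray (illustrative sketch only).

    Assumptions not taken from the abstract: depth is split into
    equal-width buckets between `near` and `far`, and the Gaussian's
    opacity is set to the probability of the bucket that was drawn.
    """
    probs = softmax(logits)
    num_buckets = len(probs)

    # Draw a discrete bucket index according to the predicted distribution.
    index = rng.choices(range(num_buckets), weights=probs, k=1)[0]

    # Place the mean at the centre of the sampled bucket.
    width = (far - near) / num_buckets
    depth = near + (index + 0.5) * width
    mean = tuple(o + depth * d for o, d in zip(origin, direction))

    # Tie opacity to the sampled bucket's probability. In an autodiff
    # framework this gives the probabilities a gradient path.
    opacity = probs[index]
    return mean, opacity, index


if __name__ == "__main__":
    rng = random.Random(0)
    mean, opacity, index = sample_gaussian_mean(
        origin=(0.0, 0.0, 0.0),
        direction=(0.0, 0.0, 1.0),
        logits=[0.1, 2.0, -1.0, 0.5],
        near=1.0,
        far=5.0,
        rng=rng,
    )
    print(index, mean, round(opacity, 3))
```

In a real model the logits would come from an image encoder, one set per pixel, and the operation would run batched on tensors rather than per ray.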
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Novel View Synthesis | LLFF | PSNR | 22.99 | 124 |
| Novel View Synthesis | RealEstate10K | PSNR | 26.09 | 116 |
| Monocular Depth Estimation | NYU V2 | Delta 1 Acc | 0.138 | 113 |
| Novel View Synthesis | DTU | PSNR | 12.89 | 100 |
| Novel View Synthesis | DL3DV | PSNR | 27.201 | 61 |
| Novel View Synthesis | Blender | PSNR | 15.77 | 60 |
| Novel View Synthesis | ScanNet | PSNR | 19.606 | 58 |
| Novel View Synthesis | ACID | PSNR | 28.284 | 51 |
| Novel View Synthesis | Replica | PSNR | 26.28 | 39 |
| Novel View Synthesis | RealEstate-10K 2-view | PSNR | 26.09 | 28 |
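
Most rows above report PSNR (peak signal-to-noise ratio, in dB; higher is better). For reference, here is a minimal sketch of the standard definition, assuming pixel intensities in the range [0, 1] and flattened images given as plain lists. The function name `psnr` and the `max_value` parameter are illustrative and are not taken from any particular benchmark's evaluation code.

```python
import math


def psnr(pred, target, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    if len(pred) != len(target) or not pred:
        raise ValueError("inputs must be non-empty and of equal length")

    # Mean squared error over all pixels.
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    if mse == 0:
        return math.inf  # identical images

    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    # Each pixel is off by 0.1, so MSE = 0.01 and PSNR = 20 dB.
    print(round(psnr([0.5, 0.5], [0.6, 0.4]), 2))
```

Because PSNR is logarithmic in the mean squared error, a gain of 1 dB corresponds to roughly a 21% reduction in MSE.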