SelfSplat: Pose-Free and 3D Prior-Free Generalizable 3D Gaussian Splatting
About
We propose SelfSplat, a novel 3D Gaussian Splatting model designed to perform pose-free and 3D prior-free generalizable 3D reconstruction from unposed multi-view images. This setting is inherently ill-posed: it lacks ground-truth data and learned geometric information, and it requires accurate 3D reconstruction without finetuning, which makes it difficult for conventional methods to achieve high-quality results. Our model addresses these challenges by integrating explicit 3D representations with self-supervised depth and pose estimation techniques, resulting in reciprocal improvements in both pose accuracy and 3D reconstruction quality. Furthermore, we incorporate a matching-aware pose estimation network and a depth refinement module to enhance geometry consistency across views, ensuring more accurate and stable 3D reconstructions. To demonstrate the performance of our method, we evaluate it on large-scale real-world datasets, including RealEstate10K, ACID, and DL3DV. SelfSplat achieves superior results over previous state-of-the-art methods in both appearance and geometry quality, and it demonstrates strong cross-dataset generalization capabilities. Extensive ablation studies and analysis further validate the effectiveness of our proposed methods. Code and pretrained models are available at https://gynjn.github.io/selfsplat/
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Pose Estimation | ScanNet | AUC @ 5° | 3.3 | 41 |
| Novel View Synthesis | RE10K Large | PSNR | 24.142 | 12 |
| Novel View Synthesis | RE10K (Average) | PSNR | 19.931 | 12 |
| Novel View Synthesis | RE10K (Medium) | PSNR | 19.648 | 12 |
| Novel View Synthesis | RE10K Small | PSNR | 15.557 | 12 |
| View Reconstruction | ACID 2-view (test) | PSNR | 26.71 | 11 |
| View Reconstruction | RE10K 2-view (test) | PSNR | 24.22 | 11 |
| Pose Estimation | ACID | AUC @ 5° | 6.9 | 11 |
| Pose Estimation | RE10K | AUC @ 5° | 0.031 | 11 |
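Most of the results above are reported as PSNR (peak signal-to-noise ratio, in dB). For readers unfamiliar with the metric, the following is a minimal sketch of the standard PSNR definition, assuming pixel intensities are normalized to [0, 1] and supplied as flat sequences. The function name `psnr` and the `max_value` parameter are illustrative choices, not taken from the SelfSplat code, and the benchmark's exact evaluation protocol (resolution, cropping, averaging over views) may differ.

```python
import math


def psnr(rendered, target, max_value=1.0):
    """Return the peak signal-to-noise ratio in dB between two pixel sequences.

    Assumption: both inputs are flat sequences of floats in [0, max_value].
    """
    if len(rendered) != len(target) or len(rendered) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    # Mean squared error over all pixel values.
    mse = sum((r - t) ** 2 for r, t in zip(rendered, target)) / len(rendered)

    # Identical images have zero error, so PSNR is unbounded.
    if mse == 0:
        return math.inf

    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    # A uniform error of 0.1 per pixel gives an MSE of 0.01, i.e. about 20 dB.
    print(psnr([0.0, 0.0, 0.0], [0.1, 0.1, 0.1]))
```

Higher PSNR means the rendered view is closer to the target image. Because the scale is logarithmic, a difference of a few dB between rows of the table corresponds to a substantial change in mean squared error.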