FlowCam: Training Generalizable 3D Radiance Fields without Camera Poses via Pixel-Aligned Scene Flow
About
Reconstruction of 3D neural fields from posed images has emerged as a promising method for self-supervised representation learning. The key challenge preventing the deployment of these 3D scene learners on large-scale video data is their dependence on precise camera poses from structure-from-motion, which is prohibitively expensive to run at scale. We propose a method that jointly reconstructs camera poses and 3D neural scene representations online and in a single forward pass. We estimate poses by first lifting frame-to-frame optical flow to 3D scene flow via differentiable rendering, preserving locality and shift-equivariance of the image processing backbone. SE(3) camera pose estimation is then performed via a weighted least-squares fit to the scene flow field. This formulation enables us to jointly supervise pose estimation and a generalizable neural scene representation via re-rendering the input video, and thus, train end-to-end and fully self-supervised on real-world video datasets. We demonstrate that our method performs robustly on diverse, real-world video, notably on sequences traditionally challenging to optimization-based pose estimation techniques.
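The weighted least-squares SE(3) fit mentioned above can be illustrated with a small sketch. The abstract does not say which solver is used, so this example assumes the standard closed-form weighted Procrustes (Kabsch) solution via SVD. The function name `weighted_rigid_fit`, the synthetic points, and the per-point weights are all hypothetical. Here `src` stands in for 3D surface points seen in one frame, and `dst` for the same points displaced by the estimated scene flow.

```python
import numpy as np

def weighted_rigid_fit(src, dst, weights):
    """Return R (3x3), t (3,) minimizing sum_i w_i * ||R @ src_i + t - dst_i||^2."""
    w = weights / weights.sum()                 # normalize weights to sum to 1
    src_mean = (w[:, None] * src).sum(axis=0)   # weighted centroids
    dst_mean = (w[:, None] * dst).sum(axis=0)
    src_c = src - src_mean
    dst_c = dst - dst_mean
    # Weighted 3x3 cross-covariance between the centered point sets.
    H = (w[:, None] * src_c).T @ dst_c
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection so that det(R) = +1.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_mean - R @ src_mean
    return R, t

# Synthetic check: move random points by a known rigid transform and recover it.
rng = np.random.default_rng(0)
points = rng.normal(size=(100, 3))
angle = 0.3
R_true = np.array([
    [np.cos(angle), -np.sin(angle), 0.0],
    [np.sin(angle),  np.cos(angle), 0.0],
    [0.0,            0.0,           1.0],
])
t_true = np.array([0.1, -0.2, 0.5])
flow = points @ R_true.T + t_true - points      # stand-in for 3D scene flow
weights = rng.uniform(0.1, 1.0, size=100)       # stand-in for confidence weights

R_est, t_est = weighted_rigid_fit(points, points + flow, weights)
print(np.allclose(R_est, R_true), np.allclose(t_est, t_true))
```

Every step here (weighted means, a matrix product, an SVD) is differentiable, which is consistent with the end-to-end training the abstract describes. Whether the recovered transform is the camera pose or its inverse depends on the coordinate convention, which the abstract does not specify.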
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| View Synthesis | CO3D-Hydrants (test) | LPIPS | 0.4143 | 12 |
| View Synthesis | KITTI (test) | PSNR | 17.69 | 11 |
| Pose Estimation | RealEstate-10K (Small) | Rotation Avg Error (°) | 11.883 | 7 |
| Pose Estimation | RealEstate-10K Medium | Rotation Avg Error (°) | 4.154 | 7 |
| Pose Estimation | RealEstate-10K Large | Rotation Avg Error (°) | 2.349 | 7 |
| Pose Estimation | RealEstate-10K (Avg) | Rotation Avg Error (°) | 7.426 | 7 |
| Pose Estimation | ACID Small | Rotation Avg Error (°) | 8.663 | 7 |
| Pose Estimation | ACID Medium | Rotation Avg Error (°) | 8.778 | 7 |
| Pose Estimation | ACID Large | Rotation Avg Error (°) | 9.305 | 7 |
| Pose Estimation | ACID (Avg) | Rotation Avg Error (°) | 9.001 | 7 |