DFNet: Enhance Absolute Pose Regression with Direct Feature Matching
About
We introduce a camera relocalization pipeline that combines absolute pose regression (APR) and direct feature matching. By incorporating exposure-adaptive novel view synthesis, our method successfully addresses photometric distortions in outdoor environments that existing photometric-based methods fail to handle. With domain-invariant feature matching, our solution improves pose regression accuracy using semi-supervised learning on unlabeled data. The pipeline consists of two components: a Novel View Synthesizer and DFNet. The former synthesizes novel views while compensating for changes in exposure; the latter regresses camera poses and extracts robust features that close the domain gap between real and synthetic images. Furthermore, we introduce an online synthetic data generation scheme. We show that these approaches effectively enhance camera pose estimation in both indoor and outdoor scenes. As a result, our method achieves state-of-the-art accuracy, outperforming existing single-image APR methods by as much as 56% and reaching accuracy comparable to 3D structure-based methods.
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Camera Localization | 7 Scenes | Average Position Error (m) | 0.02 | 46 |
| Visual Localization | 7Scenes (test) | Chess Median Angular Error (°) | 0.57 | 41 |
| Visual Localization | Cambridge Landmarks (test) | Avg Median Positional Error (m) | 0.35 | 35 |
| Camera Relocalization | 7-Scenes (test) | Median Translation Error (cm) | 6 | 30 |
| Visual Localization | 7scenes indoor | Positional Error (Chess, cm) | 3 | 30 |
| Visual Localization | Cambridge Landmarks | King's Positional Error (cm) | 37 | 28 |
| Pose Estimation | 7 Scenes | Average Median Translation Error (m) | 0.02 | 23 |
| Visual Localization | Cambridge Landmarks College | Median Translation Error (m) | 0.73 | 23 |
| Visual Localization | Cambridge Landmarks Church | Median Translation Error (m) | 1.37 | 23 |
| Camera Relocalization | Cambridge Landmarks (test) | Median Translation Error (cm) | 119 | 22 |
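
The benchmarks above report median translation and angular errors over test frames. As a rough illustration of how such metrics are commonly computed, the sketch below assumes each pose is a `(translation, quaternion)` pair with the quaternion in `(w, x, y, z)` order. The function names and the pose layout are hypothetical and are not taken from the DFNet code or from any benchmark's official evaluation script.

```python
import math
from statistics import median


def translation_error(t_pred, t_gt):
    """Euclidean distance between predicted and ground-truth positions."""
    return math.dist(t_pred, t_gt)


def rotation_error_deg(q_pred, q_gt):
    """Angle in degrees between two orientations given as quaternions.

    Both quaternions are normalized here, and the absolute value of the
    dot product is used so that q and -q count as the same rotation.
    """
    norm_pred = math.sqrt(sum(c * c for c in q_pred))
    norm_gt = math.sqrt(sum(c * c for c in q_gt))
    dot = abs(sum(p * g for p, g in zip(q_pred, q_gt))) / (norm_pred * norm_gt)
    dot = min(1.0, dot)  # guard against rounding slightly above 1
    return math.degrees(2.0 * math.acos(dot))


def median_pose_errors(pred, gt):
    """Median translation and rotation errors over paired pose lists.

    Assumed layout: pred and gt are equal-length lists of
    (translation, quaternion) tuples.
    """
    t_errs = [translation_error(tp, tg) for (tp, _), (tg, _) in zip(pred, gt)]
    r_errs = [rotation_error_deg(qp, qg) for (_, qp), (_, qg) in zip(pred, gt)]
    return median(t_errs), median(r_errs)
```

The translation error comes out in whatever unit the input positions use, so the metre and centimetre rows in the table differ only by a unit conversion. Benchmark entries that average over scenes would apply a further mean over per-scene medians, which this sketch leaves out.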