ImLoc: Revisiting Visual Localization with Image-based Representation
About
Existing visual localization methods are typically either 2D image-based, which are easy to build and maintain but limited in effective geometric reasoning, or 3D structure-based, which achieve high accuracy but require a centralized reconstruction and are difficult to update. In this work, we revisit visual localization with a 2D image-based representation and propose to augment each image with estimated depth maps to capture the geometric structure. Supported by the effective use of dense matchers, this representation is not only easy to build and maintain, but also achieves the highest accuracy in challenging conditions. With compact compression and a GPU-accelerated LO-RANSAC implementation, the whole pipeline is efficient in both storage and computation and allows for a flexible trade-off between accuracy and memory efficiency. Our method achieves a new state-of-the-art accuracy on various standard benchmarks and outperforms existing memory-efficient methods at comparable map sizes. Code will be available at https://github.com/cvg/Hierarchical-Localization.
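To illustrate how a depth-augmented image can supply geometric structure, the following minimal sketch back-projects matched keypoints of a database image into 3D world points. It is not the authors' implementation: it assumes a pinhole camera, a depth map storing distance along the camera z-axis, and a known camera-to-world pose, and the function and parameter names are hypothetical.

```python
import numpy as np


def lift_keypoints_to_world(keypoints_px, depth_map, K, T_world_from_cam):
    """Back-project 2D keypoints of a database image to 3D world points.

    Assumptions (not taken from the paper):
      keypoints_px     : (N, 2) array of (x, y) pixel coordinates
      depth_map        : (H, W) array of depth along the camera z-axis
      K                : (3, 3) pinhole intrinsics with last row [0, 0, 1]
      T_world_from_cam : (4, 4) rigid camera-to-world transform
    Returns (points_world, valid): (M, 3) points for the valid keypoints
    and an (N,) boolean mask marking which keypoints were valid.
    """
    kp = np.asarray(keypoints_px, dtype=np.float64)
    depth_map = np.asarray(depth_map, dtype=np.float64)
    K = np.asarray(K, dtype=np.float64)
    T = np.asarray(T_world_from_cam, dtype=np.float64)
    h, w = depth_map.shape

    # Nearest-neighbour depth lookup; keypoints outside the image are invalid.
    cols = np.rint(kp[:, 0]).astype(int)
    rows = np.rint(kp[:, 1]).astype(int)
    inside = (cols >= 0) & (cols < w) & (rows >= 0) & (rows < h)
    depth = np.zeros(len(kp))
    depth[inside] = depth_map[rows[inside], cols[inside]]
    valid = inside & np.isfinite(depth) & (depth > 0)

    # Rays with unit z in the camera frame: K^{-1} [x, y, 1]^T.
    pixels_h = np.column_stack([kp[valid], np.ones(int(valid.sum()))])
    rays = pixels_h @ np.linalg.inv(K).T

    # Scale each ray by its depth, then move the points into the world frame.
    points_cam = rays * depth[valid][:, None]
    R, t = T[:3, :3], T[:3, 3]
    points_world = points_cam @ R.T + t
    return points_world, valid
```

Pairing each valid world point with the query keypoint it was matched to gives 2D-3D correspondences, which a PnP solver inside a RANSAC loop could then use to estimate the query pose.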
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Visual Localization | Aachen Day-Night v1.1 (Night) | Success Rate (0.25m, 2°) | 74.3 | 58 |
| Visual Localization | Aachen Day-Night v1.1 (Day) | SR (0.25m, 2°) | 89.3 | 50 |
| Visual Localization | Cambridge Landmarks | King's Positional Error (cm) | 11 | 28 |
| Visual Relocalization | Oxford RobotCar Night Queries | Bodleian Library Acc (0.25m, 2°) | 79.85 | 8 |
| Visual Relocalization | Oxford RobotCar Day Queries | Acc (Bodleian Library, 0.25m, 2°) | 98.4 | 7 |
| Visual Localization | LaMAR Phone queries (val) | CAB Recall (1°, 0.1m) | 58.84 | 3 |
| Visual Localization | LaMAR Phone queries (test) | CAB Recall (1°, 0.1m) | 63.8 | 3 |
| Visual Localization | LaMAR (val) | CAB Recall (1°, 0.1m) | 86.79 | 3 |
| Visual Localization | LaMAR (test) | CAB Recall (1°, 0.1m) | 80.53 | 3 |