Beyond First-Order: Learning Riemannian Geometries for Invariant Visual Place Recognition
About
Visual Place Recognition (VPR) demands representations robust to drastic environmental and viewpoint shifts. Existing aggregation paradigms either depend on extensive supervised training or rely on first-order pooling, often struggling to preserve structural correlations under extreme shifts or incurring high adaptation costs. In this work, we propose Riemannian Invariant Aggregation (RIA), a unified geometric framework that explicitly models second-order scene structure on the Symmetric Positive Definite (SPD) manifold. By treating perturbations as tractable congruence transformations, RIA leverages geometry-aware Riemannian mappings to project covariance descriptors into a linearized Euclidean space, effectively preserving invariant structural components while suppressing noise. Extensive evaluations demonstrate that RIA achieves zero-shot performance comparable to supervised methods, and establishes state-of-the-art accuracy with simple fine-tuning, particularly in unstructured environments. The source code will be released.
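The abstract does not say which Riemannian mapping RIA uses, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes the common log-Euclidean map (matrix logarithm followed by upper-triangle vectorization) as the projection into a linearized Euclidean space, and it uses the affine-invariant distance to illustrate why congruence transformations are tractable on the SPD manifold. All function names and the ridge parameter `eps` are hypothetical.

```python
import numpy as np


def covariance_descriptor(features, eps=1e-5):
    """Second-order descriptor: (N, D) local features -> (D, D) SPD matrix."""
    centered = features - features.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / max(features.shape[0] - 1, 1)
    # Small ridge (assumed value) keeps the matrix strictly positive definite.
    return cov + eps * np.eye(cov.shape[0])


def spd_log(spd):
    """Matrix logarithm of an SPD matrix via its eigendecomposition."""
    eigvals, eigvecs = np.linalg.eigh(spd)
    return (eigvecs * np.log(eigvals)) @ eigvecs.T


def log_euclidean_vector(spd):
    """Map an SPD matrix to a flat Euclidean vector of length D(D+1)/2.

    Off-diagonal entries are scaled by sqrt(2) so the vector's Euclidean
    norm equals the Frobenius norm of the matrix logarithm.
    """
    log_m = spd_log(spd)
    upper = np.triu_indices(log_m.shape[0], k=1)
    return np.concatenate([np.diag(log_m), np.sqrt(2.0) * log_m[upper]])


def affine_invariant_distance(a, b):
    """Affine-invariant Riemannian distance between two SPD matrices."""
    eigvals, eigvecs = np.linalg.eigh(a)
    a_inv_sqrt = (eigvecs / np.sqrt(eigvals)) @ eigvecs.T
    middle = a_inv_sqrt @ b @ a_inv_sqrt
    middle = 0.5 * (middle + middle.T)  # remove numerical asymmetry
    return np.linalg.norm(np.log(np.linalg.eigvalsh(middle)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-ins for local features of two images (random data, not real VPR features).
    feats_a = rng.normal(size=(200, 4))
    feats_b = rng.normal(size=(200, 4)) @ rng.normal(size=(4, 4))
    cov_a = covariance_descriptor(feats_a)
    cov_b = covariance_descriptor(feats_b)

    # Linearized descriptor that a standard Euclidean retrieval index could consume.
    print(log_euclidean_vector(cov_a).shape)

    # A congruence X -> W X W^T leaves the affine-invariant distance unchanged.
    w = rng.normal(size=(4, 4)) + 3.0 * np.eye(4)
    before = affine_invariant_distance(cov_a, cov_b)
    after = affine_invariant_distance(w @ cov_a @ w.T, w @ cov_b @ w.T)
    print(np.isclose(before, after))
```

The eigendecomposition route is used here because it is exact for symmetric matrices and avoids a general-purpose matrix logarithm. Whether RIA uses this map, a learned variant, or a different metric is not stated in the abstract.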
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Visual Place Recognition | Tokyo24/7 | Recall@1 | 89.2 | 229 |
| Visual Place Recognition | Pitts30k | Recall@1 | 86.7 | 170 |
| Visual Place Recognition | St Lucia | R@1 | 97.2 | 76 |
| Visual Place Recognition | 17 Places | Recall@1 | 64.7 | 19 |
| Visual Place Recognition | Gardens | Recall@1 | 97.5 | 8 |
| Visual Place Recognition | Oxford | Recall@1 | 98.4 | 8 |
| Visual Place Recognition | Baidu | Recall@1 | 67.5 | 8 |