EffoVPR: Effective Foundation Model Utilization for Visual Place Recognition

About

The task of Visual Place Recognition (VPR) is to predict the location of a query image from a database of geo-tagged images. Recent studies in VPR have highlighted the significant advantage of employing pre-trained foundation models like DINOv2 for the VPR task. However, these models are often deemed inadequate for VPR without further fine-tuning on VPR-specific data. In this paper, we present an effective approach to harness the potential of a foundation model for VPR. We show that features extracted from self-attention layers can act as a powerful re-ranker for VPR, even in a zero-shot setting. Our method not only outperforms previous zero-shot approaches but also introduces results competitive with several supervised methods. We then show that a single-stage approach utilizing internal ViT layers for pooling can produce global features that achieve state-of-the-art performance, with impressive feature compactness down to 128D. Moreover, integrating our local foundation features for re-ranking further widens this performance gap. Our method also demonstrates exceptional robustness and generalization, setting new state-of-the-art performance, while handling challenging conditions such as occlusion, day-night transitions, and seasonal variations.
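
The two-stage pipeline the abstract describes (retrieval with global features, then re-ranking with local features) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: descriptors are plain lists of floats rather than ViT features, re-ranking scores a candidate by counting mutual nearest-neighbour matches between local features, and every function name here is hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query_global, db_globals, k):
    """Stage 1: indices of the k database images most similar to the query."""
    order = sorted(range(len(db_globals)),
                   key=lambda i: cosine(query_global, db_globals[i]),
                   reverse=True)
    return order[:k]

def mutual_nn_count(q_locals, d_locals):
    """Count local-feature pairs that are each other's nearest neighbour."""
    if not q_locals or not d_locals:
        return 0
    sim = [[cosine(q, d) for d in d_locals] for q in q_locals]
    # Best database feature for each query feature, and vice versa.
    best_d = [max(range(len(d_locals)), key=lambda j: row[j]) for row in sim]
    best_q = [max(range(len(q_locals)), key=lambda i: sim[i][j])
              for j in range(len(d_locals))]
    return sum(1 for i, j in enumerate(best_d) if best_q[j] == i)

def rerank(q_locals, db_locals, candidates):
    """Stage 2: reorder candidates by match count (assumed scoring rule)."""
    return sorted(candidates,
                  key=lambda c: mutual_nn_count(q_locals, db_locals[c]),
                  reverse=True)
```

In use, `retrieve` would shortlist candidates from the full database and `rerank` would reorder only that shortlist, so the more expensive local matching runs on a handful of images per query.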

Issar Tzachor, Boaz Lerner, Matan Levy, Michael Green, Tal Berkovitz Shalev, Gavriel Habib, Dvir Samuel, Noam Korngut Zailer, Or Shimshi, Nir Darshan, Rami Ben-Ari • 2024

Related benchmarks

Task | Dataset | Result | Rank
Visual Place Recognition | MSLS (val) | Recall@1 92.8 | 236
Visual Place Recognition | Pitts30k | Recall@1 93.9 | 164
Visual Place Recognition | Tokyo24/7 | Recall@1 98.7 | 146
Visual Place Recognition | MSLS Challenge | Recall@1 79 | 134
Visual Place Recognition | Nordland | Recall@1 95 | 112
Visual Place Recognition | SPED | Recall@1 93.1 | 106
Visual Place Recognition | Pittsburgh30k (test) | Recall@1 94.8 | 86
Visual Place Recognition | AmsterTime | Recall@1 65.5 | 83
Visual Place Recognition | St Lucia | R@1 100 | 76
Visual Place Recognition | Nordland | Recall@1 95 | 72
Showing 10 of 24 rows
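
The Recall@1 results listed here follow the usual retrieval convention: a query counts as correct when at least one of its top-K retrieved database images is a true match, where "true match" is defined per dataset (commonly a distance or frame tolerance). A minimal sketch of that metric, with hypothetical names and inputs; it does not reproduce any benchmark's official evaluation script:

```python
def recall_at_k(ranked_lists, positives, k):
    """Percentage of queries with at least one true match in the top k.

    ranked_lists[q] is the ranked list of database indices for query q;
    positives[q] is the set of database indices that count as correct.
    """
    if not ranked_lists:
        return 0.0
    hits = sum(1 for ranked, pos in zip(ranked_lists, positives)
               if any(idx in pos for idx in ranked[:k]))
    return 100.0 * hits / len(ranked_lists)
```

With K = 1 this reduces to checking whether the single top-ranked database image is a true match for each query.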
