BoQ: A Place is Worth a Bag of Learnable Queries

About

In visual place recognition, accurately identifying and matching images of locations under varying environmental conditions and viewpoints remains a significant challenge. In this paper, we introduce a new technique, called Bag-of-Queries (BoQ), which learns a set of global queries designed to capture universal place-specific attributes. Unlike existing methods that employ self-attention and generate the queries directly from the input features, BoQ employs distinct learnable global queries, which probe the input features via cross-attention, ensuring consistent information aggregation. In addition, our technique provides an interpretable attention mechanism and integrates with both CNN and Vision Transformer backbones. The performance of BoQ is demonstrated through extensive experiments on 14 large-scale benchmarks. It consistently outperforms current state-of-the-art techniques including NetVLAD, MixVPR and EigenPlaces. Moreover, as a global retrieval technique (one-stage), BoQ surpasses two-stage retrieval methods, such as Patch-NetVLAD, TransVPR and R2Former, all while being orders of magnitude faster and more efficient. The code and model weights are publicly available at https://github.com/amaralibey/Bag-of-Queries.
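The core idea above, a fixed bag of queries that probes the input features through cross-attention, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes plain single-head scaled dot-product attention with no learned projections, uses random vectors as stand-ins for the learned queries and the backbone features, and assumes the query outputs are concatenated and L2-normalized into one global descriptor. The names `cross_attention` and `global_descriptor` and all sizes are hypothetical. The actual model is in the repository linked above.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, features):
    """Single-head cross-attention without learned projections (assumption).

    queries:  Q vectors that do not depend on the input image
    features: N local feature vectors from a backbone
    Returns (outputs, weights): Q aggregated vectors and the Q x N attention map.
    """
    dim = len(queries[0])
    scale = 1.0 / math.sqrt(dim)
    outputs, weights = [], []
    for q in queries:
        # Each query scores every local feature, then averages them by weight.
        scores = [scale * sum(qi * fi for qi, fi in zip(q, f)) for f in features]
        attn = softmax(scores)
        weights.append(attn)
        outputs.append(
            [sum(a * f[d] for a, f in zip(attn, features)) for d in range(dim)]
        )
    return outputs, weights


def global_descriptor(queries, features):
    """Concatenate the query outputs and L2-normalize them (assumed aggregation)."""
    outputs, _ = cross_attention(queries, features)
    flat = [v for out in outputs for v in out]
    norm = math.sqrt(sum(v * v for v in flat)) or 1.0
    return [v / norm for v in flat]


# Hypothetical sizes: 4 queries, 6 local features, 8 channels.
rng = random.Random(0)
NUM_QUERIES, NUM_FEATURES, DIM = 4, 6, 8
# In a trained model the queries would be learned parameters; here they are random.
queries = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_QUERIES)]
features = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_FEATURES)]

descriptor = global_descriptor(queries, features)
_, attention = cross_attention(queries, features)
print(len(descriptor))  # 32 (NUM_QUERIES * DIM)
```

Because the queries are the same for every image, each row of `attention` shows which local features a given query attends to, which is one way to read the interpretability claim in the abstract.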

Amar Ali-Bey, Brahim Chaib-draa, Philippe Giguère • 2024

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Visual Place Recognition | MSLS (val) | Recall@1: 93.8 | 305 |
| Visual Place Recognition | Tokyo24/7 | Recall@1: 98.1 | 229 |
| Visual Place Recognition | Pitts30k | Recall@1: 93.7 | 170 |
| Visual Place Recognition | Pitts250k | Recall@1: 96.6 | 163 |
| Visual Place Recognition | Nordland | Recall@1: 90.6 | 163 |
| Visual Place Recognition | MSLS Challenge | Recall@1: 79 | 156 |
| Visual Place Recognition | SPED | Recall@1: 92.5 | 118 |
| Visual Place Recognition | Pittsburgh30k (test) | Recall@1: 93.7 | 106 |
| Visual Place Recognition | AmsterTime | Recall@1: 63 | 100 |
| Visual Place Recognition | St Lucia | R@1: 100 | 76 |
Showing 10 of 32 rows
