SupScene: Scene-Structured Overlap Supervision for Image Retrieval in Unconstrained SfM

About

Image retrieval is a critical step for reducing the quadratic cost of image matching in unconstrained Structure-from-Motion (SfM). Unlike generic image retrieval, however, the goal in SfM is to identify geometrically matchable image pairs rather than merely semantically similar images. Prevailing methods are largely trained under anchor-centric tuple guidance, which organizes training around isolated tuples and under-utilizes the dense, graded overlap structure naturally established within an SfM scene. In this work, we present SupScene, a scene-structured training framework that samples connected local subgraphs from SfM overlap graphs and jointly supervises all valid within-subgraph pairwise relations. To explicitly align the trained descriptor with geometric co-visibility, we further introduce an overlap-ordered objective that combines multi-similarity optimization with a continuous relative-overlap ranking term. In addition, the framework is instantiated with a lightweight Structural Context Probe Pooling (SCPP) head that aggregates complementary structural responses into a compact global descriptor. Experiments on multiple benchmarks demonstrate that our method significantly improves overall retrieval performance and enhances the completeness of downstream SfM reconstructions. Code and models are available at https://github.com/Suxilan/SupScene.
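The scene-structured sampling and the overlap-ordered ranking idea described above can be illustrated with a small sketch. This is not the authors' implementation: the function names (`sample_connected_subgraph`, `pairwise_targets`, `overlap_ranking_penalty`), the breadth-first sampling strategy, the hinge form of the ranking penalty, and the toy overlap values are all assumptions made for illustration, and the multi-similarity term is omitted entirely.

```python
import random
from collections import deque
from itertools import combinations, permutations


def sample_connected_subgraph(overlap, seed, size, min_overlap=0.0, rng=None):
    """Grow a connected node set from `seed` by breadth-first search,
    following only edges whose overlap exceeds `min_overlap` (assumed strategy)."""
    rng = rng or random.Random()
    visited, seen, queue = [seed], {seed}, deque([seed])
    while queue and len(visited) < size:
        node = queue.popleft()
        neighbors = [n for n, w in overlap.get(node, {}).items()
                     if w > min_overlap and n not in seen]
        rng.shuffle(neighbors)  # randomize which neighbors are taken first
        for n in neighbors:
            if len(visited) >= size:
                break
            seen.add(n)
            visited.append(n)
            queue.append(n)
    return visited


def pairwise_targets(overlap, nodes):
    """Overlap target for every unordered pair in the subgraph (0.0 if no edge)."""
    return {(a, b): overlap.get(a, {}).get(b, 0.0)
            for a, b in combinations(nodes, 2)}


def overlap_ranking_penalty(similarity, targets, nodes):
    """Hypothetical ranking term: for each anchor a and pair (b, c) with
    overlap(a, b) > overlap(a, c), apply a hinge whose margin is the overlap gap."""
    def get(table, x, y):
        return table[(x, y)] if (x, y) in table else table[(y, x)]

    total, count = 0.0, 0
    for a in nodes:
        others = [n for n in nodes if n != a]
        for b, c in permutations(others, 2):
            gap = get(targets, a, b) - get(targets, a, c)
            if gap > 0:
                sim_gap = get(similarity, a, b) - get(similarity, a, c)
                total += max(0.0, gap - sim_gap)
                count += 1
    return total / count if count else 0.0


# Toy symmetric overlap graph (values invented for illustration).
overlap = {
    "A": {"B": 0.8, "C": 0.3},
    "B": {"A": 0.8, "C": 0.5, "D": 0.1},
    "C": {"A": 0.3, "B": 0.5},
    "D": {"B": 0.1},
    "E": {},
}
nodes = sample_connected_subgraph(overlap, "A", size=3, rng=random.Random(0))
targets = pairwise_targets(overlap, nodes)
```

Every pair inside the sampled subgraph receives a target, including pairs with zero overlap, which is the contrast with tuple-based training that supervises only anchor-to-positive and anchor-to-negative relations. In a real training loop the `similarity` table would come from descriptor dot products rather than a hand-written dictionary.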

Xulei Shi, Maoyu Wang, Yuning Peng, Guanbo Wang, Xin Wang, Yifan Liao, Qi Chen, Pengjie Tao • 2026

Related benchmarks

Task	Dataset	Result
Image Retrieval	GL3D official (test)	Recall@25: 73	6
Structure-from-Motion	1DSfM Gendarmenmarkt	Registered Images: 1.05e+3	4
Structure-from-Motion	1DSfM Madrid Metropolis	Registered Images: 491	4
Structure-from-Motion	1DSfM Alamo	Registered Images: 972	4

