MARCO: Navigating the Unseen Space of Semantic Correspondence

About

Recent advances in semantic correspondence rely on dual-encoder architectures, combining DINOv2 with diffusion backbones. While accurate, these billion-parameter models generalize poorly beyond training keypoints, revealing a gap between benchmark performance and real-world usability, where queried points rarely match those seen during training. Building upon DINOv2, we introduce MARCO, a unified model for generalizable correspondence driven by a novel training framework that enhances both fine-grained localization and semantic generalization. By coupling a coarse-to-fine objective that refines spatial precision with a self-distillation framework that expands sparse supervision beyond annotated regions, our approach transforms a handful of keypoints into dense, semantically coherent correspondences. MARCO sets a new state of the art on SPair-71k, AP-10K, and PF-PASCAL, with gains that amplify at fine-grained localization thresholds (+8.9 PCK@0.01) and the strongest generalization to unseen keypoints (+5.1, SPair-U) and categories (+4.7, MP-100), while remaining 3x smaller and 10x faster than diffusion-based approaches. Code is available at https://github.com/visinf/MARCO.

Claudia Cuttano, Gabriele Trivigno, Carlo Masone, Stefan Roth • 2026

Related benchmarks

Task	Dataset	Metric	Result
Semantic Correspondence	SPair-71k (test)	PCK@0.1	87.2	146
Semantic Correspondence	PF-PASCAL	PCK @ alpha=0.1	96.9	116
Semantic Correspondence	SPair-71k	PCK @ 0.01	27	40
Semantic Correspondence	AP-10K Intra-species (test)	PCK@0.10	89.1	29
Semantic Correspondence	AP-10K cross-family	PCK@0.10	83.4	21
Semantic Correspondence	AP-10K	PCK@0.1 (I.S.)	89.1	15
Semantic Correspondence	SpairU	PCK@0.10	67.5	11
Semantic Correspondence	AP-10K C.S.	PCK@0.10	88.3	10
Semantic Correspondence	SPair-71k Geo-Aware	PCK@0.01	22.8	9
Semantic Correspondence	SPair-U (Unseen keypoints)	Aeroplane Score	86.6	8

Showing 10 of 12 rows
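
For readers unfamiliar with the PCK@alpha metric reported above, the following is a minimal sketch of how the Percentage of Correct Keypoints is commonly computed. It is not taken from the MARCO code base: the function name `pck`, its arguments, and the choice to normalize by the longer side of the object bounding box are assumptions made for illustration (some benchmarks normalize by image size instead).

```python
import math


def pck(pred, gt, bbox_size, alpha=0.1):
    """Fraction of predicted keypoints within alpha * max(bbox_size) of the ground truth.

    pred, gt:   lists of (x, y) pixel coordinates, in the same order.
    bbox_size:  (width, height) of the object bounding box (assumed normalizer).
    alpha:      relative distance threshold, e.g. 0.1 or 0.01.
    """
    if len(pred) != len(gt):
        raise ValueError("pred and gt must contain the same number of keypoints")
    if not gt:
        raise ValueError("at least one keypoint is required")

    # A prediction counts as correct if it lies within this many pixels of the target.
    threshold = alpha * max(bbox_size)
    correct = sum(
        1
        for (px, py), (gx, gy) in zip(pred, gt)
        if math.hypot(px - gx, py - gy) <= threshold
    )
    return correct / len(gt)


# Hypothetical example: three keypoints on an object with a 100 x 80 pixel box.
pred = [(10.0, 10.0), (50.0, 52.0), (90.0, 70.0)]
gt = [(10.0, 12.0), (50.0, 50.0), (60.0, 70.0)]
bbox = (100.0, 80.0)

print(pck(pred, gt, bbox, alpha=0.1))   # 10 px threshold: two of three keypoints are correct
print(pck(pred, gt, bbox, alpha=0.01))  # 1 px threshold: no keypoint is correct
```

The second call shows why results at PCK@0.01 are much lower than at PCK@0.1: the tolerance shrinks by a factor of ten, so only near-exact localizations count.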
