Telling Left from Right: Identifying Geometry-Aware Semantic Correspondence
About
While pre-trained large-scale vision models have shown significant promise for semantic correspondence, their features often struggle to grasp the geometry and orientation of instances. This paper identifies the importance of being geometry-aware for semantic correspondence and reveals a limitation of current foundation models' features under simple post-processing. We show that incorporating this information can markedly enhance semantic correspondence performance with simple but effective solutions in both zero-shot and supervised settings. We also construct a new challenging benchmark for semantic correspondence, built from an existing animal pose estimation dataset, for both pre-training and validating models. Our method achieves a PCK@0.10 score of 65.4 (zero-shot) and 85.6 (supervised) on the challenging SPair-71k dataset, outperforming the state of the art by absolute gains of 5.5 and 11.0 percentage points, respectively. Our code and datasets are publicly available at: https://telling-left-from-right.github.io/.
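The PCK@α scores reported above (Percentage of Correct Keypoints) count a predicted keypoint as correct when it falls within α times a reference size (image or bounding-box side) of the ground-truth keypoint. A minimal sketch of this metric, assuming NumPy arrays of (x, y) keypoints; the function name and signature are illustrative, not the authors' implementation:

```python
import numpy as np

def pck(pred_kpts, gt_kpts, ref_size, alpha=0.10):
    """Fraction of predicted keypoints within alpha * ref_size of ground truth.

    pred_kpts, gt_kpts: (N, 2) arrays of (x, y) coordinates.
    ref_size: scalar reference length, e.g. max side of the object bbox
              (for the bbox-normalized PCK variants) or of the image.
    Hypothetical helper for illustration, not the paper's exact code.
    """
    pred_kpts = np.asarray(pred_kpts, dtype=float)
    gt_kpts = np.asarray(gt_kpts, dtype=float)
    # Euclidean distance between each prediction and its ground truth
    dists = np.linalg.norm(pred_kpts - gt_kpts, axis=-1)
    # A keypoint is "correct" if it lies inside the alpha-scaled tolerance
    return float(np.mean(dists <= alpha * ref_size))

# Example: two keypoints, bbox side 100, alpha = 0.05 -> tolerance 5 px.
# One prediction is exact, the other is 10 px off, so PCK = 0.5.
score = pck([[0, 0], [10, 0]], [[0, 0], [0, 0]], ref_size=100, alpha=0.05)
print(score)  # 0.5
```

Smaller α (e.g. the PCK@0.01 rows below) demands far tighter localization, which is why those scores are much lower than PCK@0.1 on the same dataset.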
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Semantic Correspondence | SPair-71k (test) | PCK@0.1 | 85.6 | 122 |
| Semantic Correspondence | PF-Pascal (test) | PCK@0.1 | 95 | 106 |
| Semantic Correspondence | PF-PASCAL | PCK @ α=0.1 | 95.7 | 98 |
| Semantic Correspondence | SPair-71k | Φ_bbox @ α=0.1 | 61.3 | 29 |
| Semantic Correspondence | SPair-71k | Aero Accuracy | 92 | 23 |
| Semantic Matching | SPair-71k | PCK@0.05 | 75.3 | 14 |
| Semantic Correspondence | AP-10K Intra-species (test) | PCK@0.01 | 23.2 | 12 |
| Semantic Correspondence | AP-10K Cross-species (test) | PCK@0.01 | 0.217 | 12 |
| Semantic Correspondence | SPair-71k | PCK @ 0.01 | 22 | 11 |
| Semantic Matching | SPair-71k | PCK @ α_bbox (0.1) | 82.9 | 9 |