Telling Left from Right: Identifying Geometry-Aware Semantic Correspondence
About
While pre-trained large-scale vision models have shown significant promise for semantic correspondence, their features often struggle to grasp the geometry and orientation of instances. This paper identifies the importance of being geometry-aware for semantic correspondence and reveals a limitation of current foundation models' features under simple post-processing. We show that incorporating this information can markedly enhance semantic correspondence performance with simple but effective solutions in both zero-shot and supervised settings. We also construct a new challenging benchmark for semantic correspondence, built from an existing animal pose estimation dataset, for both pre-training and validating models. Our method achieves a PCK@0.10 score of 65.4 (zero-shot) and 85.6 (supervised) on the challenging SPair-71k dataset, outperforming the state of the art by absolute gains of 5.5 and 11.0 percentage points, respectively. Our code and datasets are publicly available at: https://telling-left-from-right.github.io/.
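The PCK@α scores reported above (Percentage of Correct Keypoints) count a predicted keypoint as correct when it falls within α times a reference size (image or bounding-box side) of the ground-truth keypoint. A minimal sketch of this metric, assuming NumPy arrays of (x, y) keypoints; the function name and signature are illustrative, not the authors' implementation:

```python
import numpy as np

def pck(pred_kpts, gt_kpts, ref_size, alpha=0.10):
    """Fraction of predicted keypoints within alpha * ref_size of ground truth.

    pred_kpts, gt_kpts: (N, 2) arrays of (x, y) coordinates.
    ref_size: scalar reference length, e.g. max side of the object bbox
              (for the bbox-normalized PCK variants) or of the image.
    Hypothetical helper for illustration, not the paper's exact code.
    """
    pred_kpts = np.asarray(pred_kpts, dtype=float)
    gt_kpts = np.asarray(gt_kpts, dtype=float)
    # Euclidean distance between each prediction and its ground truth
    dists = np.linalg.norm(pred_kpts - gt_kpts, axis=-1)
    # A keypoint is "correct" if it lies inside the alpha-scaled tolerance
    return float(np.mean(dists <= alpha * ref_size))

# Example: two keypoints, bbox side 100, alpha = 0.05 -> tolerance 5 px.
# One prediction is exact, the other is 10 px off, so PCK = 0.5.
score = pck([[0, 0], [10, 0]], [[0, 0], [0, 0]], ref_size=100, alpha=0.05)
print(score)  # 0.5
```

Smaller α (e.g. the PCK@0.01 rows below) demands far tighter localization, which is why those scores are much lower than PCK@0.1 on the same dataset.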
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Semantic Correspondence | SPair-71k (test) | PCK@0.1 | 85.6 | 122 |
| Semantic Correspondence | PF-Pascal (test) | PCK@0.1 | 95 | 106 |
| Semantic Correspondence | PF-PASCAL | PCK @ α=0.1 | 95.7 | 98 |
| Semantic Correspondence | SPair-71k | Φ_bbox @ α=0.1 | 61.3 | 29 |
| Semantic Correspondence | SPair-71k | Aero Accuracy | 92 | 23 |
| Semantic Matching | SPair-71k | PCK@0.05 | 75.3 | 14 |
| Semantic Correspondence | AP-10K Intra-species (test) | PCK@0.01 | 23.2 | 12 |
| Semantic Correspondence | AP-10K Cross-species (test) | PCK@0.01 | 0.217 | 12 |
| Semantic Correspondence | SPair-71k | PCK @ 0.01 | 22 | 11 |
| Semantic Matching | SPair-71k | PCK @ α_bbox (0.1) | 82.9 | 9 |