| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| Winoground 1.0 (test) | Human | Text Score | 89.5 | 23 | 12d ago |
| SugarCrepe++ | Slipform | Accuracy | 66.24 | 20 | 1mo ago |
| ARO | CE-CLIP+ | Accuracy | 0.804 | 14 | 3mo ago |
| ARO Benchmark Visual Genome Flickr30k MS-COCO (test) | CapPa | VG Attribution | 89.3 | 11 | 3mo ago |
| Winoground standard (test) | GPT-4o | Text Score | 75.5 | 7 | 3mo ago |
| Winoground (test) | JAM (Spread) | Text Score | 61.3 | 7 | 16d ago |
| ARO (test) | syn-CLIP | VG-Rel | 71.4 | 4 | 3mo ago |
| Winoground clean (no-tag) | CyCLIP | Text Score | 32.16 | 2 | 3mo ago |