| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Compositional Reasoning | VL-Checklist | Attribute Score 81.8 | 37 |
| Image-Text Matching | VL-Checklist | AURC 0.258 | 23 |
| Vision-Language Probing | VL-CheckList (test) | Object: Avg 86.9 | 17 |
| Hard-negative classification | VL-CheckList Object | Score (Center) 71.2 | 6 |
| Vision-Language Alignment | VL-Checklist VG-spatial | Accuracy 63.5 | 4 |