| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Visual Spatial Reasoning | VSR | Accuracy | 88.87 | 59 |
| Spatial Reasoning | VSR | LLM-Judge Accuracy | 92.61 | 28 |
| Visual Question Answering | VSR | Top-1 Accuracy | 73.77 | 26 |
| Spatial Relationship Understanding | VSR | Overall Accuracy | 73.9 | 17 |
| Relational Reasoning | VSR | Accuracy | 85.7 | 16 |
| 2D Spatial Reasoning | VSR | Accuracy | 75.6 | 10 |
| Spatial Reasoning | VSR (ood) | Accuracy | 84.8 | 10 |
| Spatial Reasoning | VSR (Visual Spatial Reasoning) | Binary Robust Acc | 75.8 | 9 |
| Spatial Und. (Mono.) | VSR (test) | Accuracy | 81.05 | 9 |
| Directional attribution | VSR (n=240) | DAE | 96.8 | 8 |
| Visual Spatial Reasoning | VSR ZOOM-Hard | GPT Accuracy | 55.98 | 6 |
| Visual Spatial Reasoning | VSR ZOOM-Medium | GPT Accuracy | 67.63 | 6 |
| Visual Spatial Reasoning | VSR (ZOOM-Easy) | GPT Accuracy | 73.09 | 6 |
| Confidence estimation | VSR (test) | AUROC | 67.4 | 6 |
| Spatial Reasoning | VSR zero-shot (test) | Accuracy (zero-shot) | 63.67 | 6 |
| General | VSR | Score | 80.6 | 3 |