| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| Chinese Kinship | PoT-LLM | Accuracy | 71.2 | 20 | 2mo ago |
| VSR | QWEN2.5-VL-7B + PGT | Accuracy | 85.7 | 16 | 9d ago |
| W-UP | THINKLITE-VL | Accuracy (%) | 98.3 | 16 | 9d ago |
| Sort-of-CLEVR (test) | AiT-Base | Relational Accuracy | 80.03 | 11 | 3mo ago |
| sort-of-CLEVR | Linear Transformer + D3 (w/ F) | Unary Accuracy | 99 | 8 | 3mo ago |
| RAVEN (test) | IV-CL | Average Accuracy | 92.5 | 5 | 3mo ago |
| Something-Else (Comp) | ORVIT | Top-1 Accuracy | 69.7 | 3 | 3mo ago |
| Something-Else Base | ORVIT | Top-1 Accuracy | 0.871 | 3 | 3mo ago |