| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Visual Reasoning | VisualPuzzles OOD (test) | Overall Accuracy: 47.95 | 8 |
| Multimodal Visual Logic Reasoning | VisualPuzzles | Mean@5: 58.2 | 8 |
| Visual logic | VisualPuzzles | Top-1 Accuracy: 43.15 | 7 |
| Visual Reasoning | VisualPuzzles | Algorithmic Score: 37.4 | 3 |