| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Abstract Visual Reasoning | ARC-AGI 1 | Accuracy (Pass@2): 98 | 27 |
| Abstract Visual Reasoning | ARC-AGI 2 | Accuracy (Pass@2): 100 | 26 |
| Abstraction and Reasoning | ARC-AGI 2 (public evaluation) | Pass@2: 100 | 13 |
| Compositional Reasoning | ARC-AGI 2 | Accuracy: 33.6 | 11 |
| Abstraction and Reasoning | ARC-AGI Public Training Set (Easy) (60 tasks) | Total Cost: 0.41 | 10 |
| Reasoning | ARC-AGI 2 (test) | Accuracy: 43.3 | 10 |
| Reasoning | ARC-AGI 2 | Accuracy: 50 | 9 |
| Abstraction and Reasoning | ARC-AGI | ARC-1 Score: 58.2 | 9 |
| Reasoning | ARC-AGI public evaluation set V2 | Accuracy: 97.9 | 6 |
| Symbolic Reasoning | ARC-AGI 1 (test) | Pass@2: 47.5 | 6 |
| ARC-AGI | ARC-AGI (test) | Accuracy (ARC-AGI Test): 67 | 5 |
| Abstract and compositional reasoning | ARC-AGI 2 (test) | Accuracy (ARC-AGI 2 Test): 51 | 4 |
| Puzzle Solving | ARC-AGI 3 | Levels Won: 2 | 3 |
| Symbolic Reasoning | ARC-AGI 2 (test) | Pass@1: 9.9 | 3 |
| Abstract Reasoning | ARC-AGI (concept evaluation) | Accuracy: 86.8 | 2 |