| AbsR | KALE | AbsR Accuracy 83.41 | | 56 | 3mo ago |
| I-RAVEN (test) | DIRCR | Accuracy (Overall) 97.8 | | 21 | 1mo ago |
| ARC-AGI 2 | | Pass@2 29.4 | | 16 | 1d ago |
| ARC 1 | | Pass@2 79.6 | | 12 | 1d ago |
| ARC-AGI v1 (test) | | Accuracy 98 | | 12 | 14d ago |
| ARC-AGI v2 (test) | | Accuracy 100 | | 11 | 14d ago |
| Bongard-LOGO (BD) | C–G (AD) | Mean Accuracy 72 | | 10 | 1mo ago |
| Bongard-LOGO (FF) | C–G (AD) | Mean Accuracy 79.3 | | 10 | 1mo ago |
| Bongard-LOGO (HD) | C–G + Concept (AP + C) | Mean Accuracy 61.1 | | 8 | 1mo ago |
| CLEVR-RPM (test) | NeSyCoCo | Accuracy 100 | | 7 | 3mo ago |
| MMStar | Nash | Accuracy 63.21 | | 5 | 14d ago |
| Private human-curated 177 ARC-style tasks (evaluation set) | Mini-Arch | Pass@k 55.93 | | 5 | 2mo ago |
| PGM Triples (test) | VAE-WReN | Accuracy 24.6 | | 5 | 3mo ago |
| PGM Triples (val) | Neural Interpreters | Accuracy 79.9 | | 5 | 3mo ago |
| PGM Triple Pairs (test) | Neural Interpreters | Accuracy 45.2 | | 5 | 3mo ago |
| PGM Triple Pairs (val) | Neural Interpreters | Accuracy 68.6 | | 5 | 3mo ago |
| PGM Attribute Pairs (test) | VAE-WReN | Accuracy 36.8 | | 5 | 3mo ago |
| PGM Attribute Pairs (val) | VAE-WReN | Accuracy 70.1 | | 5 | 3mo ago |
| PGM Neutral (test) | Neural Interpreters | Accuracy 77 | | 5 | 3mo ago |
| PGM Neutral (val) | Neural Interpreters | Accuracy 77.3 | | 5 | 3mo ago |
| ARC-AGI-3 25 Public Games v2 | | RHAE 0 | | 4 | 7d ago |
| ARC-AGI | GPT-4 + CoT | Frugality Index (Fp) 3.54 | | 4 | 19d ago |
| PGM Extrapolation (test) | Neural Interpreters | Accuracy 19.4 | | 4 | 3mo ago |
| PGM Extrapolation (val) | ViT | Accuracy 92.2 | | 4 | 3mo ago |
| PGM Interpolation (test) | Neural Interpreters | Accuracy 70.5 | | 4 | 3mo ago |