| Dataset Name | SOTA Method | Metric | Value | Trend | |
|---|---|---|---|---|---|
| Addition | | Accuracy | 99.5 | 7 | 3mo ago |
| Addition synthetic (val) | Skill-curriculum | Skill 1 Score | 0.016 | 7 | 3mo ago |
| Common Numeracy Benchmarks | | RMSE [0, 10^2] | 0.64 | 5 | 2mo ago |
| Add (test) | Relational Memory Core | Per-Char Accuracy | 99.9 | 4 | 3mo ago |
| CLEVR-Addition 7 objects (extrapolation) | DeepObjectLog | Task Accuracy | 59.81 | 3 | 16d ago |
| CLEVR-Addition (test) | MESH | Task Accuracy | 96.97 | 3 | 16d ago |
| Addition 200 prompts Qwen3-8B (test) | Probe-round | Accuracy | 100 | 3 | 28d ago |
| CLEVR-Addition (OOD Class) | DeepObjectLog | Task Accuracy | 28.57 | 1 | 16d ago |