| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Downstream Repair | Arithmetic trace-level 60/20/20 (test) | Repair Accuracy: 92.4 | 16 |
| Circuit Localization | Arithmetic 1.0 (test) | CPR: 1.017 | 9 |
| Causal Variable Identification | Arithmetic | F1 (X): 88.2 | 7 |
| Outcome Reasoning | Arithmetic | M' F1 Mean: 87.8 | 7 |
| Circuit Localization | Arithmetic | CPR: 1.09 | 6 |
| Arithmetic | Arithmetic | Accuracy: 97.8 | 5 |
| Mathematical Reasoning | Arithmetic | Accuracy (Easy): 71.764 | 4 |
| Linear Concept Accessibility and Steering | Arithmetic | Peak Alin: 0.995 | 4 |
| Mathematical Reasoning | Arithmetic (val) | Accuracy: 83 | 3 |
| Arithmetic | Arithmetic | HQ Neuron Count: 273 | 2 |
| Arithmetic | Arithmetic concept family | Accuracy: 0 | 2 |
| Linear accessibility probing | Arithmetic concept family | Top-1 Accuracy: 65.7 | 1 |