| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Component-level attribution | IOI | Dissimilarity (dis.) 0 | 32 |
| Circuit Discovery | IOI | AUC 83.6 | 12 |
| MCQ Classification | IOI 2 v1 (Eva) | Accuracy 100 | 6 |
| MCQ Classification | IOI v1 (Infer) | Accuracy 1 | 6 |
| Circuit Discovery | IOI | KL Div 0.668 | 6 |
| Competitive Programming | IOI 2025 | Score S 100 | 4 |
| Autointerpretation | IOI | Accuracy 76 | 4 |
| Intrinsic Cluster Quality Evaluation | IOI Pythia-160M | Mean Silhouette Score 0.07 | 3 |
| Intrinsic Cluster Quality Evaluation | IOI | Silhouette Score (Mean) 0.03 | 3 |
| Circuit Discovery | IOI | Sparsity 96.74 | 3 |
| Circuit Discovery | IOI 400 examples v1 | KL Divergence 0.22 | 3 |
| Circuit Discovery | IOI 200 examples v1 | KL Divergence 0.25 | 3 |
| Code Reasoning | IOI 2025 | Score 439.28 | 2 |
| Circuit Discovery | IOI 100K examples v1 | KL Divergence 0.2 | 2 |