| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Symbolic Reasoning | Letter | Accuracy | 92.4 | 67 |
| Classification | LETTER (test) | Accuracy | 96 | 52 |
| Online Class-Incremental Learning | Letter | Final Mean Accuracy | 89.5 | 26 |
| Model Extraction | Letter-low | Fidelity | 89.2 | 24 |
| Graph Classification | LETTER-L TU Dataset | Accuracy | 98 | 20 |
| Graph Classification | LETTER-H TU Dataset | Accuracy | 81.6 | 20 |
| LTL Instruction Following | Letter Finite-horizon (full) | Success Rate (SR) | 100 | 19 |
| Constrained Clustering | letter | Success Rate | 100 | 18 |
| Multiclass classification | letter (test) | Log Loss (Posterior) | 0.2656 | 18 |
| Clustering | Letter | ARI | 0.189 | 18 |
| Clustering | letter | CPU Time | 1.67 | 17 |
| Clustering | letter | Clustering Inertia | 123,150.01 | 17 |
| Binary Classification | Letter UCI (test) | Accuracy | 97.5 | 17 |
| Outlier Detection | letter (historical) | AUROC | 90.09 | 17 |
| Off-policy evaluation for classification error | letter | Bias | -0.085 | 15 |
| Continual Clustering | Letter | AI NMI | 57.1 | 15 |
| Outlier Detection | letter (Group II) | AUROC | 0.9009 | 14 |
| Binary Classification | Letter (test) | AUC | 91.4 | 13 |
| Online Class Incremental Learning | Letter | Average Forgetting | 4.1 | 11 |
| LTL Instruction Following | Letter Infinite-horizon (full) | µAcc | 7.13 | 10 |
| LTL-guided Reinforcement Learning | Letter Finite-horizon (test) | Success Rate (SR) | 100 | 9 |
| Classification | Letter pi+=0.8 UCI (test) | Accuracy | 97.6 | 9 |
| Classification | Letter pi+=0.5 UCI (test) | Accuracy | 97.5 | 9 |
| Classification | Letter (pi+=0.2) UCI (test) | Accuracy | 97.8 | 9 |
| Nonstationary backward transfer | Letter | BWT-ARI | -0.038 | 7 |