| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Symbolic Reasoning | Letter | Accuracy 92.4 | 67 |
| Classification | LETTER (test) | Accuracy 96 | 45 |
| Graph Classification | LETTER-L TU Dataset | Accuracy 98 | 20 |
| Graph Classification | LETTER-H TU Dataset | Accuracy 81.6 | 20 |
| LTL Instruction Following | Letter Finite-horizon (full) | Success Rate (SR) 100 | 19 |
| Constrained Clustering | letter | Success Rate 100 | 18 |
| Multiclass classification | letter (test) | Log Loss (Posterior) 0.2656 | 18 |
| Clustering | letter | CPU Time 1.67 | 17 |
| Clustering | letter | Clustering Inertia 123,150.01 | 17 |
| Binary Classification | Letter UCI (test) | Accuracy 97.5 | 17 |
| Outlier Detection | letter (historical) | AUROC 90.09 | 17 |
| Off-policy evaluation for classification error | letter | Bias -0.085 | 15 |
| Outlier Detection | letter (Group II) | AUROC 0.9009 | 14 |
| Binary Classification | Letter (test) | AUC 91.4 | 13 |
| LTL Instruction Following | Letter Infinite-horizon (full) | µAcc 7.13 | 10 |
| LTL-guided Reinforcement Learning | Letter Finite-horizon (test) | Success Rate (SR) 100 | 9 |
| Classification | Letter pi+=0.8 UCI (test) | Accuracy 97.6 | 9 |
| Classification | Letter pi+=0.5 UCI (test) | Accuracy 97.5 | 9 |
| Classification | Letter (pi+=0.2) UCI (test) | Accuracy 97.8 | 9 |
| Continual Clustering | Letter | AI NMI 57.1 | 8 |
| Outlier Detection | letter (1600, 32) (full) | Recall 33 | 7 |
| Binary Classification | letter LIBSVM (test) | Average AUC 0.811 | 7 |
| Classification | LETTER (10-fold cross-val) | Test F1 Score 65.9 | 7 |
| Clustering | Letter | Accuracy (ACC) 68 | 6 |
| Off-Policy Evaluation | Letter (UCI) | MSE 0.0018 | 6 |