| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Classification | Vehicle | Accuracy | 85.6 | 65 |
| Confidence Calibration | vehicle | Calibration Error | 0.001 | 44 |
| Class Prior Estimation | Vehicle | Estimation Error | 1.6 | 36 |
| Multi-class classification | Vehicle | F1-score | 83.7 | 20 |
| Mislabel Detection | vehicle | AUROC | 0.853 | 17 |
| Classification | vehicle | Cohen's Kappa | 0.989 | 16 |
| Off-policy evaluation for classification error | vehicle | Bias | -0.027 | 15 |
| Classification | vehicle | ROC AUC | 100 | 14 |
| Identifying mislabeled points | vehicle | F1 Score | 21 | 12 |
| Identifying mislabeled points | vehicle | Precision | 16 | 12 |
| Identifying mislabeled points | vehicle | Recall | 32 | 12 |
| Classification | Vehicle 10% | PR AUC | 95.8 | 12 |
| Classification | Vehicle | PR AUC | 98.1 | 12 |
| Counterfactual Explanation Generation | vehicle | Validity | 100 | 12 |
| Multiclass Classification | vehicle | Weighted F1-score | 81 | 9 |
| Multiclass imbalanced classification | vehicle | AUC | 94.2 | 9 |
| Multiclass imbalanced classification | vehicle | Accuracy | 81.8 | 9 |
| Multiclass Imbalanced Classification | vehicle | G-Mean | 79.8 | 9 |
| Tabular Classification | vehicle 54 (test) | Test Error (%) | 11.1 | 9 |
| Classification | Vehicle (UCI) (test) | NLL | 0.422 | 9 |
| Counterfactual Explanation | vehicle | Validity | 1 | 8 |
| Active Learning | Vehicle | AULC | 83.3 | 8 |
| Classification | vehicle 54 | AUROC | 96.8 | 8 |
| Evasion attack verification | vehicle 10,000 examples (test) | Speedup | 19.9 | 8 |
| Classification | vehicle | Balanced Accuracy | 83.793 | 8 |