| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Classification | pol | Accuracy 79.6 | 36 |
| Mislabel Detection | pol | AUROC 0.97 | 17 |
| Identifying mislabeled points | pol | F1 Score (pol) 42 | 12 |
| Identifying mislabeled points | pol | Precision 32 | 12 |
| Identifying mislabeled points | pol | Recall (pol) 64 | 12 |
| Marginal Likelihood Estimation | POL (mean over 10 splits) | Test Log-Likelihood 1.27 | 12 |
| Classification | Pol N=10,082 (full) | AUROC 0.9959 | 9 |
| Regression | POL (test) | RMSE 2.199 | 9 |
| CASH | pol (test) | Test Error 1.34 | 9 |
| Regression | POL | Log Likelihood 2.555 | 8 |
| Pruning Boosted Tree Ensembles | PoL | Pruning Rate 79.3 | 7 |
| Point-level mislabeled data detection | pol | AUCPR 93 | 7 |
| Data Valuation | pol | Valuation Runtime (s) 0.23 | 5 |
| Noisy Detection | pol | AUROC 88 | 5 |
| Tabular Classification | pol | Mean Accuracy 99.33 | 5 |
| Counterfactual Explanation | Pol (test) | Mean L1 Distance 9.1 | 4 |
| Ensemble Compression | POL | S Score 20 | 4 |
| Binary Classification | POL | R500 39 | 3 |
| p-robustness Estimation | POL | R500 19 | 3 |
| Verifiable Data Valuation | pol | Proving Time (s) 25.7 | 2 |
| Cell-level outlier detection | pol | AUC 82 | 2 |
| Model Allocation | Pol | AUC 98 | 1 |
| Tabular Regression | pol (test) | Metric - | 0 |