| Task Name | Dataset Name | SOTA Result | Trend | |
|---|---|---|---|---|
| Anomaly Detection | Fraud | AUC-PR 0.925 | 31 | |
| Mislabel Detection | fraud | AUROC 95.2 | 17 | |
| Identifying mislabeled points | fraud | F1 Score 52 | 12 | |
| Identifying mislabeled points | fraud | Precision 39 | 12 | |
| Identifying mislabeled points | fraud | Recall 77 | 12 | |
| Anomaly Detection | Fraud Synthetic (full stream) | G-mean 0.821 | 11 | |
| Anomaly Detection | Fraud Synthetic | Specificity 99.8 | 11 | |
| Anomaly Detection | Fraud | Recall 75.5 | 11 | |
| Anomaly Detection | fraud Out-of-Domain | F1 Score 63.25 | 10 | |
| Anomaly Detection | Fraud Tabular | AUROC 0.9323 | 9 | |
| Binary Classification | Fraud (holdout test) | BCEL 0.003 | 7 | |
| Anomaly Detection | Fraud Stationary | AUC 96.2 | 6 | |
| Text2Cypher generation | Fraud | Q@169 | 5 | |
| Data Valuation | fraud | Valuation Runtime (s) 0.2 | 5 | |
| Tabular Perception | Fraud | F1 Score 91.3 | 4 | |
| Fraud Detection | Fraud | Accuracy 86.88 | 3 | |
| Verifiable Data Valuation | fraud | Proving Time (s) 13.9 | 3 | |