pol

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Classification | pol | Accuracy: 79.6 | 36 |
| Mislabel Detection | pol | AUROC: 0.97 | 17 |
| Identifying mislabeled points | pol | F1 Score (pol): 42 | 12 |
| Identifying mislabeled points | pol | Precision: 32 | 12 |
| Identifying mislabeled points | pol | Recall (pol): 64 | 12 |
| Marginal Likelihood Estimation | POL (mean over 10 splits) | Test Log-Likelihood: 1.27 | 12 |
| Classification | Pol N=10,082 (full) | AUROC: 0.9959 | 9 |
| Regression | POL (test) | RMSE: 2.199 | 9 |
| CASH | pol (test) | Test Error: 1.34 | 9 |
| Regression | POL | Log Likelihood: 2.555 | 8 |
| Pruning Boosted Tree Ensembles | PoL | Pruning Rate: 79.3 | 7 |
| Point-level mislabeled data detection | pol | AUCPR: 93 | 7 |
| Data Valuation | pol | Valuation Runtime (s): 0.23 | 5 |
| Noisy Detection | pol | AUROC: 88 | 5 |
| Tabular Classification | pol | Mean Accuracy: 99.33 | 5 |
| Counterfactual Explanation | Pol (test) | Mean L1 Distance: 9.1 | 4 |
| Ensemble Compression | POL | S Score: 20 | 4 |
| Binary Classification | POL | R500: 39 | 3 |
| p-robustness Estimation | POL | R500: 19 | 3 |
| Verifiable Data Valuation | pol | Proving Time (s): 25.7 | 2 |
| Cell-level outlier detection | pol | AUC: 82 | 2 |
| Model Allocation | Pol | AUC: 98 | 1 |
| Tabular Regression | pol (test) | Metric: - | 0 |