Uncertainty Estimation on MMLU AutoGen (test)

0.7315 AUROC

MATU

[Chart: score over time; axis ticks 0.492612, 0.554631, 0.61665, 0.678669; date label Apr 9, 2026]
Updated 6d ago

Evaluation Results

Date      AUROC     (unlabeled)
2026.04   0.7315    0.8833
2026.04   0.7193    0.8589
2026.04   0.7105    0.8617
2026.04   0.6867    0.8516
2026.04   0.6556    0.8363
2026.04   0.6484    0.8552
2026.04   0.6277    0.5841
2026.04   0.5981    0.5649
2026.04   0.5954    0.4745
2026.04   0.5802    0.5528
2026.04   0.5775    0.4368
2026.04   0.5759    0.5438
2026.04   0.5521    0.4288
2026.04   0.5316    0.3762
2026.04   0.5138    0.3031
2026.04   0.5018    0.2973
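
The page does not say how the AUROC scores above are computed. As a rough illustration only, the sketch below assumes the common setup for uncertainty estimation: each question gets a confidence score from the method, and the label is whether the model's answer was correct. The function name `auroc` and the toy data are hypothetical and do not come from the leaderboard.

```python
def auroc(scores, labels):
    """Pairwise AUROC: the fraction of (correct, incorrect) pairs in which
    the correct answer received the higher confidence. Ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]  # confidences on correct answers
    neg = [s for s, y in zip(scores, labels) if y == 0]  # confidences on incorrect answers
    if not pos or not neg:
        raise ValueError("AUROC needs at least one correct and one incorrect answer")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: six questions, confidence per answer, 1 = answered correctly.
confidence = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
correct = [1, 1, 0, 1, 0, 0]
print(auroc(confidence, correct))
```

Under this reading, 0.5 corresponds to confidence scores that carry no information about correctness and 1.0 to a perfect ranking, which is one way to interpret the spread from 0.5018 to 0.7315 in the table. The quadratic loop is chosen for clarity; a rank-based implementation would be preferable for a full test set.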