Uncertainty Estimation on MATH AutoGen (test)

0.7544 AUROC

MATU

[Chart: axis ticks 0.517696, 0.579148, 0.6406, 0.702052; date label Apr 9, 2026]
Updated 5d ago

Evaluation Results

Method   Date      AUROC    (unlabeled)   Links
-        2026.04   0.7544   0.4687        -
-        2026.04   0.7146   0.5334        -
-        2026.04   0.6582   0.622         -
-        2026.04   0.6524   0.5102        -
-        2026.04   0.6355   0.4512        -
-        2026.04   0.6334   0.3933        -
-        2026.04   0.6271   0.4571        -
-        2026.04   0.6111   0.4326        -
-        2026.04   0.6079   0.5931        -
-        2026.04   0.6064   0.383         -
-        2026.04   0.6015   0.5892        -
-        2026.04   0.5912   0.3802        -
-        2026.04   0.5898   0.5826        -
-        2026.04   0.5761   0.3679        -
-        2026.04   0.5385   0.409         -
-        2026.04   0.5268   0.399         -