Mathematical Reasoning on Omni-MATH (Calibration Metrics)

ECE: 0.0883

MSV 16

[Chart: ECE axis ticks 0.073816, 0.171583, 0.26935, 0.367117; date tick Mar 3, 2026]
Updated 1mo ago

Evaluation Results

Date      ECE      Metric 2   Metric 3   Method   Links
2026.03   0.0883   -          0.3888
2026.03   0.0959   -          0.4206
2026.03   0.103    -          0.455
2026.03   0.103    -          0.455
2026.03   0.103    -          0.455
2026.03   0.1104   -          0.4407
2026.03   0.1104   -          0.4407
2026.03   0.1104   -          0.4407
2026.03   0.1157   -          0.4974
2026.03   0.1206   0.2681     -
2026.03   0.1303   0.3434     -
2026.03   0.1316   -          0.4296
2026.03   0.1446   0.292      -
2026.03   0.1656   -          0.5194
2026.03   0.1695   0.341      -
2026.03   0.1776   0.3583     -
2026.03   0.2298   0.3997     -
2026.03   0.2514   0.4372     -
2026.03   0.2888   0.5115     -
2026.03   0.292    0.5175     -
2026.03   0.3195   0.5521     -
2026.03   0.3201   0.554      -
2026.03   0.3296   0.5635     -
2026.03   0.3542   0.617      -
2026.03   0.4185   0.7348     -
2026.03   0.4354   0.7306     -
2026.03   0.4396   0.7359     -
2026.03   0.4504   0.8118     -
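
The results above are ranked by ECE (expected calibration error), where lower is better. The page does not say how ECE was computed for this benchmark. As a rough illustration only, here is a minimal sketch of the common equal-width binning estimator. The function name, the choice of 10 bins, and the per-problem confidence and correctness inputs are all assumptions, not details taken from the leaderboard.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,  # assumed bin count; the leaderboard's setting is not stated
) -> float:
    """Equal-width-bin ECE: weighted mean of |accuracy - mean confidence| per bin."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("at least one prediction is required")

    conf_sum = [0.0] * n_bins   # sum of confidences falling in each bin
    hit_sum = [0] * n_bins      # number of correct answers in each bin
    count = [0] * n_bins        # number of predictions in each bin

    for c, ok in zip(confidences, correct):
        if not 0.0 <= c <= 1.0:
            raise ValueError("confidences must lie in [0, 1]")
        # A confidence of exactly 1.0 is placed in the last bin.
        idx = min(int(c * n_bins), n_bins - 1)
        conf_sum[idx] += c
        hit_sum[idx] += 1 if ok else 0
        count[idx] += 1

    ece = 0.0
    for b in range(n_bins):
        if count[b] == 0:
            continue  # empty bins contribute nothing
        avg_conf = conf_sum[b] / count[b]
        accuracy = hit_sum[b] / count[b]
        ece += (count[b] / n) * abs(accuracy - avg_conf)
    return ece
```

Under this definition, a model that reports 0.9 confidence on four problems and solves three of them has a gap of |0.75 - 0.9| in its only occupied bin. Different binning schemes (for example, equal-mass bins) can give noticeably different values, so scores are only comparable when computed the same way.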