
Mathematical Reasoning on Math dataset

85.7 Accuracy

Cloud-Only

Chart: May 29, 2026
Updated 1d ago

Evaluation Results

Method  Links
2026.05  85.7  ---
2026.05  80730.66137
2026.05  73.3961.01320
2026.05  70.2  ---
2026.05  69.4  ---
2026.05  64.1  ---
2026.05  60  ---
2026.05  57.31560.1979
2026.05  39.6  ---
2026.05  35.9  ---