Mathematical Reasoning on AMC (Pass@N and Token Usage)

56.8 Mean@16

OptPO-GRPO

[Chart omitted; date label: Dec 2, 2025]
Updated 4d ago

Evaluation Results

Date      Mean@16   col2    col3    col4    col5
2025.12   56.8      77.1    57.8    -       49.06
2025.12   56        79.5    57.8    -       43.5
2025.12   54.3      79.5    55.4    -       -
2025.12   53.2      79.5    55.4    -       41.66
2025.12   52.9      80.7    55.4    -       -
2025.12   52.1      78.3    54.2    -       -
2025.12   48.5      81.9    54.2    -       26.93
2025.12   48.1      83.1    55.4    -       -
2025.12   48.1      83.1    53      -       26.8
2025.12   47.7      83.1    54.2    -       -
2025.12   39.3      81.9    49.4    -       -
2025.12   38.3      81.9    44.6    -       27.42
2025.12   17.1      54.2    22.9    -       41.76
2025.12   15.2      55.4    16.9    -       78.36
2025.12   14.9      51.8    18.1    -       -
2025.12   13.9      55.4    19.3    -       -
2025.12   11.4      54.2    16.9    -       -
2025.12   11        51.8    13.3    -       18.96
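
The page names Mean@16 and Pass@N but does not define them. As a rough sketch only, assuming the common conventions (Mean@k is the average accuracy over k sampled answers per problem, Pass@k is the share of problems with at least one correct answer among k samples), the two metrics could be computed as follows. The function names and the toy data are hypothetical and are not taken from the leaderboard or from any listed method.

```python
from typing import Sequence


def mean_at_k(correct: Sequence[Sequence[bool]]) -> float:
    """Assumed definition: average per-sample accuracy, as a percentage.

    correct[i][j] is True when sample j for problem i is right.
    """
    per_problem = [sum(row) / len(row) for row in correct]
    return 100.0 * sum(per_problem) / len(per_problem)


def pass_at_k(correct: Sequence[Sequence[bool]]) -> float:
    """Assumed definition: percentage of problems with any correct sample."""
    solved = sum(1 for row in correct if any(row))
    return 100.0 * solved / len(correct)


# Hypothetical toy data: 3 problems, 4 samples each (not leaderboard data).
toy = [
    [True, True, False, True],
    [False, False, False, False],
    [False, True, False, False],
]

print(round(mean_at_k(toy), 1))  # 33.3
print(round(pass_at_k(toy), 1))  # 66.7
```

Under these assumed definitions Pass@k can never be lower than Mean@k on the same samples, since a problem counts as solved as soon as any one of its samples is correct.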