
Math Reasoning on MATH 500 (Acc, ∆Tok)

Accuracy: 99.2

Qwen3-Next-80B + THINKBRAKE

Updated 1mo ago

Evaluation Results

Date      Acc     ∆Tok
2025.10   99.2    -21.6
2025.10   98.2    -
2025.10   97.2    -1.4
2025.10   97      -
2025.10   96.9    -7.7
2025.10   96.8    -
2025.10   96.6    -20.4
2025.10   96.4    -17.9
2025.10   96      -
2025.10   95.8    -20.6
2025.10   95.4    -14.4
2025.10   95.3    -
2025.10   95.2    -25.1
2025.10   94.6    -31.4
2025.10   94.4    -14.2
2025.10   93.8    -
2025.10   92      -23.2
2025.10   91.2    -57.9
2025.10   91.2    -31
2025.10   89.6    -39.4
2025.10   89      -49.8
2025.10   88.6    -51.3
2025.10   87.6    -100
2025.10   87.2    -42.1
2025.10   87.2    -44.5
2025.10   86.8    -100
2025.10   85.4    -100
2025.10   81.4    -60.6
2025.10   78.4    -100
2025.10   71.1    -
2025.10   70.8    -17.7
2025.10   70.4    -17.2
2025.10   68.8    -50.4
2025.10   66.4    -100
2025.10   64.2    -47.2
2025.10   56.4    -100