
Mathematical Reasoning on MATH 500 (Accuracy Variance, Consistency, Token Variance)

Mean Accuracy: 79.7

Gemma-3-12B-Instruct

[Chart: axis ticks 22.708, 37.504, 52.3, 67.096; date label Dec 2, 2025]
Updated 3mo ago

Evaluation Results

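| Method | Date | Mean Accuracy | Accuracy Variance, Consistency, Token Variance | Links |
|---|---|---|---|---|
| | 2025.12 | 79.7 | 0.001655.4144,471.24 | |
| | 2025.12 | 68.6 | 0.00735.4122,866.7 | |
| | 2025.12 | 67 | 0.00249.5153,144.84 | |
| | 2025.12 | 58.5 | 0.00735.4305,133.92 | |
| | 2025.12 | 28.5 | 0.0261.11,547,849 | |
| | 2025.12 | 24.9 | 0.0152.71,512,173.79 | |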