Mathematical Reasoning on AMC 2023 (Accuracy and Average Tokens)

[Figure: accuracy over time. x-axis: Apr 2, 2026 to Apr 18, 2026; y-axis: Accuracy. Highlighted entry: BCR-Qwen3-4B (Ours), Accuracy 100.]
Updated 1mo ago

Evaluation Results

Date      Accuracy   Average Tokens
2026.04   100         7,128
2026.04   97.5       10,457
2026.04   97.5        8,648
2026.04   97.5        8,204
2026.04   97.5        5,431
2026.04   97.5        8,801
2026.04   97.5        7,712
2026.04   97.5        4,671
2026.04   95          5,295
2026.04   95          5,506
2026.04   95          4,712
2026.04   95          5,793
2026.04   95          5,456
2026.04   92.5        3,786
2026.04   90          6,149
2026.04   90          5,432
2026.04   87.5        2,637
2026.04   87.5        5,933
2026.04   85          8,157
2026.04   85          7,154
2026.04   85          5,713
2026.04   85          5,002
2026.04   85          9,425
2026.04   82.5        7,704
2026.04   82.5        6,856
2026.04   81          4,120
2026.04   80         10,713
2026.04   80          7,021
2026.04   75          7,050
2026.04   72.5        5,706
2026.04   72.5        6,605
2026.04   67.5        9,702
2026.04   67.5        5,712
2026.04   42.5          703
2026.04   32.5          828