
Mathematical Reasoning on Minerva Math (Accuracy, Generation Length)

Accuracy: 68.4

BCR-Qwen3-4B (Ours)

[Chart: accuracy over time, Dec 18, 2025 to Apr 7, 2026]
Updated 9d ago

Evaluation Results

Date     Accuracy  Generation Length  Method  Links
2026.04  68.4      3,338
2026.04  66.2      4,576
2026.04  57.7      5,714
2026.04  54.4      5,201
2026.04  49.1      4,205
2026.04  48.5      7,575
2026.04  48.5      2,494
2026.04  46.7      5,678
2026.04  45.6      3,806.2
2026.04  44.9      2,593.1
2026.04  43.8      3,027.3
2026.04  43.8      2,742.5
2026.04  43.4      5,413
2026.04  43.4      2,429.9
2026.04  40.8      816.7
2026.04  40.4      1,957.3
2026.04  40.4      700.8
2026.04  40.1      948.6
2026.01  39.7      3,909
2026.04  39.3      965.9
2026.04  39        742.2
2026.01  38.2      1,504
2026.04  37.9      815.2
2026.01  37.5      1,761
2026.01  36.8      1,024
2026.04  34.6      832.7
2026.01  33.8      1,181
2026.04  33.5      1,522.3
2026.04  33.5      1,059.5
2026.04  33.1      1,168.4
2026.01  32.7      956
2025.12  32.08     -
2026.04  31.6      1,733.2
2026.04  30.9      4,903.2
2026.01  30.7      4,948
2025.12  30.24     -
2026.04  30.1      1,657.1
2026.04  30.1      1,026.5
2026.04  30.1      1,044.2
2026.04  29        853.6
2026.04  28.3      1,673.2
2026.04  27.9      3,849
2026.04  27.9      1,263.9
2026.01  27.6      862
2026.04  27.6      857.9
2026.04  27.2      2,638.1
2025.12  26.93     -
2026.01  23.5      3,021
2026.04  23.5      777
2026.01  22.1      872
2026.04  20.2      3,014.8
2026.04  16.5      1,315.8
2026.01  11.4      1,037
2026.04  11        1,032.2