Mathematical Reasoning on AIME25 (Accuracy, Efficiency k)

Accuracy (%): 46.7

Qwen3-4B-Inst-2507

Updated 4mo ago

Evaluation Results

Method	Links	Accuracy (%)	Efficiency k
Qwen3-4B-Inst-2507 2026.02		46.7	1
Qwen3-4B-Inst-2507 2026.02		26.7	2.3
Qwen3-4B-Inst-2507 2026.02		23.3	1
Qwen3-4B-Inst-2507 2026.02		23.3	2.8
L3.1-8B-Magpie 2026.02		0	1
L3.1-8B-Magpie 2026.02		0	1
L3.1-8B-Magpie 2026.02		0	5.5