Mathematical and Scientific Reasoning on OlympiadBench

Mean Accuracy: 44.53

Qwen2.5 32B + RL (synthetic)

[Chart: Mean Accuracy; y-axis ticks 41.2852, 42.1276, 42.97, 43.8124; x-axis Apr 13, 2026]
Updated 5d ago

Evaluation Results

Date      Mean Accuracy   Method   Links
2026.04   44.53
2026.04   41.41