Scientific Reasoning on JEEBench

Mean Accuracy: 52.28

Qwen2.5 32B + RL (synthetic)

[Chart: Mean Accuracy; y-axis ticks 33.664, 38.497, 43.33, 48.163; Apr 13, 2026]
Updated 5d ago

Evaluation Results

Date      Mean Accuracy   Method   Links
2026.04   52.28
2026.04   34.38