
Agentic Reasoning on BALROG

Top result: TemplateRL, 31.5 accuracy

Updated 2mo ago

Evaluation Results

Method	Date	Accuracy
TemplateRL	2025.05	31.5
OpenReasoner-Zero	2025.05	28.3
Oat-Zero	2025.05	26.2
GRPO	2025.05	25.4
PRIME-Zero	2025.05	24.1
SimpleRL-Zero	2025.05	17.4
Qwen2.5-Math-7B-Instruct	2025.05	15.4
Qwen2.5-Math-7B-Base	2025.05	12.5