STEM Reasoning on TheoremQA (Avg@2)

55.4 Avg@2

RLPR

Updated 4mo ago

Evaluation Results

Method	Links	Avg@2
RLPR 2026.01		55.4
DARL 2026.01		55.2
Oat-Zero 2026.01		53.3
RLVR 2026.01		52.2
General Reasoner 2026.01		52.1
SimpleRL-Zoo 2026.01		51.1
SimpleRL-Zoo 2026.01		49.5
TTRL 2026.01		48.8
PRIME 2026.01		47.7
VeriFree 2026.01		47.6
Qwen2.5-7B-Inst 2026.01		47.3
Qwen2.5-7B 2026.01		41.4
DARL 2026.01		39.4
RLPR 2026.01		36.5
RLVR 2026.01		32.0
Llama3.1-8B-Inst 2026.01		31.3