Code Generation on HumanEval (Accuracy@4)

Accuracy@4: 70

PAPO

Updated 3mo ago

Evaluation Results

Method	Links	Accuracy@4
PAPO 2026.03		70
ORM(GRPO) 2026.03		66
PAPO 2026.03		63.9
PAPO 2026.03		63.9
ORM(GRPO) 2026.03		61.6
Base 2026.03		56.9
Base 2026.03		54.6
ORM(DAPO) 2026.03		53.8
PAPO 2026.03		52.5
Base 2026.03		49.1
ORM(GRPO) 2026.03		39.3
Base 2026.03		35.3