Program Synthesis on CodeContests (test)
[Chart: Pass@1 and Pass@128 series. Highlighted point: GRPO-RLVR, Pass@1 = 0.2045, Mar 17, 2026.]
Updated 1mo ago
Evaluation Results
| Method | Backbone | Date | Pass@1 | Pass@128 |
|---|---|---|---|---|
| GRPO-RLVR | QWEN2.5-72B | 2026.03 | 0.2045 | 0.3697 |
| LEAFE | QWEN2.5-72B | 2026.03 | 0.1712 | 0.4788 |
| LEAFE | LLAMA3-70B | 2026.03 | 0.1409 | 0.3394 |
| GRPO-RLVR | LLAMA3-70B | 2026.03 | 0.1364 | 0.2788 |
| BASE (NO FT) | QWEN2.5-72B | 2026.03 | 0.1 | 0.3394 |
| BASE (NO FT) | LLAMA3-70B | 2026.03 | 0.0735 | 0.2485 |
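
The results above report Pass@1 and Pass@128 but do not say how the numbers were computed. As a rough illustration only, the sketch below assumes the commonly used unbiased pass@k estimator: draw n samples per problem, count the c samples that pass all tests, and estimate the chance that at least one of k samples passes. The function names `pass_at_k` and `benchmark_pass_at_k` are hypothetical, and the evaluation protocol behind this leaderboard may differ.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem from n samples, c of which are correct.

    Assumption: the standard unbiased estimator 1 - C(n - c, k) / C(n, k).
    This is not confirmed to be the protocol used for the table above.
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # Fewer than k incorrect samples: every draw of k contains a correct one.
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(correct_counts: list[int], n: int, k: int) -> float:
    """Average the per-problem estimates over a benchmark (hypothetical helper)."""
    if not correct_counts:
        raise ValueError("correct_counts must be non-empty")
    return sum(pass_at_k(n, c, k) for c in correct_counts) / len(correct_counts)
```

With k = 1 the estimate reduces to c / n, the fraction of correct samples. Pass@128 can only match or exceed Pass@1 for the same model, which is consistent with every row of the table.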