Code Reasoning on LiveCodeBench (Avg@16, Pass@16)
[Chart: best Avg@16 over time, Feb 23, 2026 to May 1, 2026. Current top result: ResRL, 43.2 Avg@16. Selectable metrics: Avg@16, Pass@16.]
Updated 29d ago
Evaluation Results
Method     Details              Date      Avg@16   Pass@16
ResRL      Backbone=Qwen3-4B    2026.05   43.2     59.9
FlowRL     Backbone=Qwen3-4B    2026.05   42.4     58.7
DAPO       Backbone=Qwen3-4B    2026.05   41       52.3
GRPO       Backbone=Qwen3-4B    2026.05   39.5     55.1
LAD        -                    2026.02   33.51    51.97
FlowRL     -                    2026.02   33.24    51.61
NSR        Backbone=Qwen3-4B    2026.05   32.8     52.3
EntAdv     -                    2026.02   32.75    50.9
GRPO       -                    2026.02   32.46    51.25
ClipCov    -                    2026.02   32.1     51.97
Backbone   Backbone=Qwen3-4B    2026.05   30.5     40.9
KLCov      -                    2026.02   29.5     50.18
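The page does not define its two metrics. As a rough sketch only, assuming the common convention that Avg@16 is the mean per-problem pass rate over 16 sampled solutions and Pass@16 is the share of problems solved by at least one of those 16 samples, the scores could be computed as below. The function names and the toy data are hypothetical and are not taken from the leaderboard.

```python
from typing import Sequence


def avg_at_k(results: Sequence[Sequence[bool]]) -> float:
    """Mean per-problem pass rate across the k samples, as a percentage.

    Assumption: results[i][j] is True when sample j for problem i passes
    all tests. This is a common reading of Avg@k, not the page's stated one.
    """
    per_problem = [sum(samples) / len(samples) for samples in results]
    return 100.0 * sum(per_problem) / len(per_problem)


def pass_at_k(results: Sequence[Sequence[bool]]) -> float:
    """Percentage of problems where at least one of the k samples passes."""
    solved = [any(samples) for samples in results]
    return 100.0 * sum(solved) / len(solved)


# Toy data (hypothetical): 3 problems, 4 samples each instead of 16.
toy = [
    [True, True, False, False],    # half of the samples pass
    [False, False, False, False],  # never solved
    [True, True, True, True],      # always solved
]

if __name__ == "__main__":
    print(f"Avg@4  = {avg_at_k(toy):.2f}")
    print(f"Pass@4 = {pass_at_k(toy):.2f}")
```

Under this reading, Pass@16 can never be lower than Avg@16 for the same set of samples, which is consistent with every row in the table above.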