
Code Generation on HumanEval (Pass@4 Accuracy and GRPO Speedup)

Top result: DUET, 77 Accuracy (Pass@4), May 8, 2026
Updated 22d ago

Evaluation Results

Date      Accuracy (Pass@4)   GRPO Speedup
2026.05   77                  2.38
2026.05   77                  1.43
2026.05   75.2                0.63
2026.05   75.2                0.75
2026.05   73.6                1
2026.05   73.6                1.19
2026.05   62.8                -
2026.05   53                  0.95
2026.05   52.9                2.04
2026.05   52.8                1
2026.05   52.8                1.13
2026.05   52.7                1.62
2026.05   52.7                0.82
2026.05   52.1                1.26
2026.05   51.7                2.51
2026.05   50.5                -
2026.05   48.9                0.95
2026.05   47.7                1.3
2026.05   47.6                1
2026.05   46.8                0.81
2026.05   34.5                -