Code Generation on HumanEval (r* pass and r_self)

r* Pass Rate: 73.4

Phi / Base

May 6, 2026
Updated 27d ago

Evaluation Results

Date      r* Pass Rate   r_self
2026.05   73.4           -0.294
2026.05   71.5           -0.627
2026.05   71.2           -0.33
2026.05   67.5           -0.447
2026.05   63.4           -0.73
2026.05   60.2           -0.473
2026.05   60             -0.235
2026.05   57.3           -0.13
2026.05   56.8           -0.096
2026.05   56.6           -0.208
2026.05   56.2           -0.106
2026.05   56.2           -0.589
2026.05   54.9           -0.913
2026.05   54.1           -0.304
2026.05   54.1           -0.479
2026.05   53.8           -0.144
2026.05   47             -0.325
2026.05   45.2           -0.334
2026.05   42.5           -0.849
2026.05   41.6           -0.563
2026.05   38.3           -0.427
2026.05   37.6           -0.426
2026.05   32.6           -0.966
2026.05   32             -0.741