
Code Generation on HumanEval (pass@2, pass@4, pass@8)

59.15 Pass@2

DLE (top-p+top-k)

Updated 3mo ago

Evaluation Results

Method (date)	Links	Pass@2 (%)	Pass@4 (%)	Pass@8 (%)
DLE (top-p+top-k) 2026.04		59.15	67.07	76.83
DLE (min-p) 2026.04		59.15	68.29	76.83
DLE (ε-sampling)-PROBFIRST 2026.04		59.15	69.51	78.66
DLE (ε-sampling)-RANDBRANCH 2026.04		57.93	65.24	73.17
Self-consistency (ε-sampling) 2026.04		55.49	64.46	75.61
DLE (ε-sampling)-DIVFIRST 2026.04		55.49	65.24	71.95
Self-consistency (min-p) 2026.04		51.83	65.85	76.83
Self-consistency (top-p+top-k) 2026.04		49.39	64.02	73.78
Self-consistency 2026.04		46.34	57.93	70.12
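
The page does not state how the pass@k scores above were computed. As a rough sketch, assuming the standard unbiased pass@k estimator commonly used with HumanEval (n samples per problem, c of which pass the unit tests), the per-benchmark score could be computed as follows. The function names `pass_at_k` and `benchmark_pass_at_k` are illustrative and do not come from this leaderboard, and the DLE and self-consistency entries may use a different protocol.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem: 1 - C(n - c, k) / C(n, k).

    n: samples generated, c: samples that passed the tests, k: budget.
    (Assumed estimator; the leaderboard does not document its method.)
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # comb(n - c, k) is 0 when fewer than k samples failed,
    # so the estimate is exactly 1.0 in that case.
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Mean pass@k over all problems, reported as a percentage.

    results: one (n, c) pair per problem.
    """
    scores = [pass_at_k(n, c, k) for n, c in results]
    return 100.0 * sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical toy data: two problems, 4 samples each.
    toy = [(4, 1), (4, 0)]
    print(benchmark_pass_at_k(toy, k=2))
```

Under this estimator, a larger k can only raise the score for a fixed set of samples, which matches the left-to-right increase in every row of the table.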