Coding Reasoning on HumanEval

Accuracy (%): 96.34

COPT

Updated 1mo ago

Evaluation Results

Method		Accuracy (%)			Links
COPT 2026.05		96.34	1,842	-
COPT 2026.05		94.51	1,023	-
CoT (Greedy) 2026.05		93.9	2,627	-
ASAG 2026.06		93.9	1,025	32.1
ASAG 2026.06		93.9	1,126	35
Vanilla 2026.06		93.3	3,216	100
ASAG 2026.06		92.7	1,249	32.7
ASAG 2026.06		92.7	1,002	26.3
Vanilla 2026.06		92.7	3,198	100
CoT 2026.05		92.68	2,368	-
Vanilla 2026.06		91.5	3,821	100
Vanilla 2026.06		90.2	3,812	100
ASAG 2026.06		82.3	1,465	23.6
Vanilla 2026.06		80.5	6,214	100
Vanilla 2026.06		78.6	5,739	100
ASAG 2026.06		78.6	1,257	21.9