Code Reasoning on HumanEval base and extended (out-of-distribution)

0.677 Accuracy

InftyThink+

Updated 4mo ago

Evaluation Results

Method	Links	Accuracy
InftyThink+ 2026.02		0.677	0.626	8.21	42.22
InftyThink+ 2026.02		0.6742	0.6191	4.66	23.9
Vanilla 2026.02		0.6044	0.5627	8.17	90.1
InftyThink+ 2026.02		0.5903	0.5434	5.02	27.5
Vanilla 2026.02		0.5703	0.5252	6.58	65.89