Code Reasoning on HumanEval base and extended (out-of-distribution)

Accuracy: 0.677

InftyThink+

Feb 6, 2026
Updated 4d ago

Evaluation Results

Method  Links
2026.02  0.677  0.6268.2142.22
2026.02  0.6742  0.61914.6623.9
2026.02  0.6044  0.56278.1790.1
2026.02  0.5903  0.54345.0227.5
2026.02  0.5703  0.52526.5865.89