Code Generation on HumanEval compile (L1)

0.1847 Pass@1

ShieldedCode

Updated 5mo ago

Evaluation Results

Method	Links	Pass@1	(unlabeled)
ShieldedCode 2026.01		0.1847	0.2794
GPT-4o 2026.01		0.1743	0.2518
DeepSeekCoder-7B 2026.01		0.0689	0.1065
GPT-3.5-Turbo 2026.01		0.0571	0.0795
Meta LLMCompiler-7B 2026.01		0.0538	0.0689
Qwen-2.5-Coder-7B 2026.01		0.0512	0.0728
StarCoder2-7B 2026.01		0.0491	0.0473
CodeLlama 2026.01		0.0326	0.0534