Code Generation on HumanEval compile (L3)

14.71 Pass@1

ShieldedCode

Updated 5mo ago

Evaluation Results

Method	Links	Pass@1	(unlabeled)
ShieldedCode 2026.01		14.71	22.83
GPT-4o 2026.01		11.89	18.99
DeepSeekCoder-7B 2026.01		6.17	8.9
Meta LLMCompiler-7B 2026.01		5.39	7
StarCoder2-7B 2026.01		5.32	6.2
Qwen-2.5-Coder-7B 2026.01		4.89	6.06
GPT-3.5-Turbo 2026.01		4.29	4.41
CodeLlama 2026.01		2.79	4.56