Kernel Generation on KernelBench Level 1
[Chart: Correctness Rate (Round 1). Highlighted point: Codex, 34 (Mar 11, 2026). Selectable metrics: Correctness Rate (Round 1), Correctness Rate (Final), Accuracy (Round 1), Accuracy (Final).]
Updated 1mo ago
Evaluation Results
| Method | Model | Date | Correctness Rate (Round 1) | Correctness Rate (Final) | Accuracy (Round 1) | Accuracy (Final) |
| Codex | GPT-5.2 | 2026.03 | 34 | 82 | 16 | 70 |
| EvoKernel | Qwen3-Coder-30B | 2026.03 | 25 | 33 | 6 | 11 |
| Pass@k | GPT-5.2 | 2026.03 | 24 | 36 | 9 | 19 |
| Pass@k | Qwen3-Coder-30B | 2026.03 | 22 | 30 | 7 | 8 |
| Pass@k | DeepSeek-V3.2 | 2026.03 | 21 | 33 | 7 | 9 |
| EvoKernel | GPT-5.2 | 2026.03 | 20 | 97 | 7 | 90 |
| Refinement | GPT-5.2 | 2026.03 | 19 | 88 | 7 | 41 |
| Refinement | DeepSeek-V3.2 | 2026.03 | 16 | 44 | 0 | 12 |
| Refinement | Qwen3-Coder-30B | 2026.03 | 13 | 22 | 2 | 6 |
| EvoKernel | DeepSeek-V3.2 | 2026.03 | 9 | 39 | 2 | 19 |