Adversarial Code Compliance on C
[Chart: Decoupling Probability and Severity Index. Highlighted point: Llama-3.1-8B, Decoupling Probability 100, Jan 29, 2026.]
Updated 4d ago
Evaluation Results
Method            Model Category  Date     Decoupling Probability  Severity Index
Llama-3.1-8B      Open So...      2026.01  100                     52.7
DeepSeek-v3.2     Open So...      2026.01  97.5                    45.9
Qwen3-235B        Open So...      2026.01  87.6                    24.7
Gemma-3-27b       Open So...      2026.01  82.6                    24.3
Gemini-2.5-Flash  Proprie...      2026.01  79.8                    35.2
GPT-5             Proprie...      2026.01  73.2                    26.5
Llama-3.2-3B      Open So...      2026.01  66.5                    14.3
GPT-OSS-120B      Open So...      2026.01  38.5                    2
GPT-5-Mini        Proprie...      2026.01  27.5                    0.6