Adversarial Code Compliance on Python (Py)
[Chart: Decoupling Probability and Severity Index by date. Labeled point: Gemma-3-27b, Decoupling Probability 99.4, Jan 29, 2026. Updated 4d ago.]
Evaluation Results
Method             Model Category   Date      Decoupling Probability   Severity Index
Gemma-3-27b        Open So...       2026.01   99.4                     44.4
Llama-3.1-8B       Open So...       2026.01   98.9                     55.6
DeepSeek-v3.2      Open So...       2026.01   95.5                     48.7
Qwen3-235B         Open So...       2026.01   88.7                     26.8
Llama-3.2-3B       Open So...       2026.01   82.2                     55.6
Gemini-2.5-Flash   Proprie...       2026.01   79.8                     39.3
GPT-5              Proprie...       2026.01   62.1                     19.6
GPT-OSS-120B       Open So...       2026.01   25.6                     0.9
GPT-5-Mini         Proprie...       2026.01   23                       0.1