Code on MBPP 1,000-example (test)
[Chart: Perplexity by date. Plotted point: Qwen3-VL-2B-Instruct, 9.0212, Jan 13, 2026. Selectable metrics: Perplexity, Delta Score.]
Updated 4d ago
Evaluation Results
Method                       Configuration              Date     Perplexity  Delta Score
Qwen3-VL-2B-Instruct         Ablation State=SRA, Ev...  2026.01  9.0212      -0.2403
Qwen3-VL-2B-Instruct         Ablation State=Base, E...  2026.01  9.2615      -
Ministral-14B-Instruct-2512  Ablation State=SRA, Ev...  2026.01  11.1405     -0.3506
Ministral-14B-Instruct-2512  Ablation State=Base, E...  2026.01  11.4911     -
Qwen3-VL-8B-Instruct         Ablation State=SRA, Ev...  2026.01  11.8694     -0.2965
Qwen3-VL-8B-Instruct         Ablation State=Base, E...  2026.01  12.1659     -
Qwen3-VL-4B-Instruct         Ablation State=SRA, Ev...  2026.01  13.5447     -0.2597
Qwen3-VL-4B-Instruct         Ablation State=Base, E...  2026.01  13.8044     -
Ministral-3B-Instruct-2512   Ablation State=SRA, Ev...  2026.01  20.3751     -1.0791
Ministral-3B-Instruct-2512   Ablation State=Base, E...  2026.01  21.4542     -
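
The page does not define Delta Score. In the results above, each SRA value matches the SRA perplexity minus the Base perplexity of the same model, so a negative value means lower perplexity than Base. The sketch below checks that reading. It is an assumption drawn from the listed numbers, and the names `RESULTS` and `delta_scores` are hypothetical.

```python
# Perplexity values copied from the results above, as (base, sra) per model.
# Assumption: Delta Score = SRA perplexity - Base perplexity.
RESULTS = {
    "Qwen3-VL-2B-Instruct": (9.2615, 9.0212),
    "Ministral-14B-Instruct-2512": (11.4911, 11.1405),
    "Qwen3-VL-8B-Instruct": (12.1659, 11.8694),
    "Qwen3-VL-4B-Instruct": (13.8044, 13.5447),
    "Ministral-3B-Instruct-2512": (21.4542, 20.3751),
}


def delta_scores(results):
    """Return SRA minus Base perplexity per model, rounded to 4 decimals."""
    return {
        model: round(sra - base, 4)
        for model, (base, sra) in results.items()
    }


if __name__ == "__main__":
    for model, delta in delta_scores(RESULTS).items():
        print(f"{model}: {delta:+.4f}")
```

Under this reading, all five models show a lower perplexity in the SRA state than in the Base state, with the largest absolute change on Ministral-3B-Instruct-2512.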