Deductive logical reasoning on ProntoQA (test)
[Chart: Error Rate and Accuracy. Highlighted entry: ICL, Error Rate 2.8, Jan 14, 2026]
Updated 3d ago
Evaluation Results
Method     Configuration              Date     Error Rate  Accuracy
ICL        Model=Qwen3-4B-Instruc...  2026.01  2.8         1.2
ICL        Model=Qwen2.5-3B-Instr...  2026.01  4.4         0.4
ICL        Model=Gemma-3-4B-Instr...  2026.01  5.6         0.8
ICL        Model=Phi-4-mini-Instr...  2026.01  6.4         2.6
SFT+       Model=Phi-4-mini-Instr...  2026.01  44.6        30.6
Increment  Model=Gemma-3-4B-Instr...  2026.01  50.8        23.6
SFT+       Model=Gemma-3-4B-Instr...  2026.01  51.8        23.2
Increment  Model=Phi-4-mini-Instr...  2026.01  52.4        34.8
Increment  Model=Qwen2.5-3B-Instr...  2026.01  54.4        19.2
SFT+       Model=Qwen2.5-3B-Instr...  2026.01  55.6        21.6
SFT+       Model=Qwen3-4B-Instruc...  2026.01  62.2        51.6
Increment  Model=Qwen3-4B-Instruc...  2026.01  80.6        64.4