Deductive logical reasoning on ProofWriter (test)
[Chart: ExcRate and Accuracy by date. Highlighted point: SFT+, ExcRate 100, Jan 14, 2026]
Updated 3d ago
Evaluation Results
Method     Model                Date     ExcRate  Accuracy
SFT+       Phi-4-mini-Instr...  2026.01  100      97.83
Increment  Phi-4-mini-Instr...  2026.01  100      97.83
SFT+       Gemma-3-4B-Instr...  2026.01  98.83    96.83
Increment  Gemma-3-4B-Instr...  2026.01  98.5     96.5
SFT+       Qwen2.5-3B-Instr...  2026.01  96.83    93.33
Increment  Qwen2.5-3B-Instr...  2026.01  96.5     93
Increment  Qwen3-4B-Instruc...  2026.01  96.5     94.17
SFT+       Qwen3-4B-Instruc...  2026.01  95       92.83
ICL        Phi-4-mini-Instr...  2026.01  87       68.67
ICL        Gemma-3-4B-Instr...  2026.01  74.5     58.33
ICL        Qwen3-4B-Instruc...  2026.01  42.67    35.5
ICL        Qwen2.5-3B-Instr...  2026.01  36.67    25.5