Deductive logical reasoning on FOLIO 203 (dev)
[Chart: Exclusion Rate and Accuracy; highlighted point: ICL, Exclusion Rate 6.4; Jan 14, 2026]
Updated 3d ago
Evaluation Results
| Method | Model | Date | Exclusion Rate | Accuracy |
|---|---|---|---|---|
| ICL | Qwen3-4B-Instruc... | 2026.01 | 6.4 | 3.45 |
| ICL | Qwen2.5-3B-Instr... | 2026.01 | 27.59 | 19.21 |
| ICL | Phi-4-mini-Instr... | 2026.01 | 45.32 | 28.08 |
| Increment | Qwen2.5-3B-Instr... | 2026.01 | 55.17 | 30.54 |
| SFT+ | Qwen2.5-3B-Instr... | 2026.01 | 57.63 | 30.05 |
| ICL | Gemma-3-4B-Instr... | 2026.01 | 58.13 | 33 |
| Increment | Qwen3-4B-Instruc... | 2026.01 | 58.13 | 38.42 |
| SFT+ | Qwen3-4B-Instruc... | 2026.01 | 60.1 | 40.39 |
| SFT+ | Phi-4-mini-Instr... | 2026.01 | 61.08 | 38.92 |
| Increment | Phi-4-mini-Instr... | 2026.01 | 62.07 | 38.92 |
| SFT+ | Gemma-3-4B-Instr... | 2026.01 | 66 | 36.95 |
| Increment | Gemma-3-4B-Instr... | 2026.01 | 69.95 | 40.89 |