Deductive logical reasoning on FOLIO 203 (dev)
[Chart: Exclusion Rate and Accuracy. Highlighted entry: ICL, Exclusion Rate 6.4, Jan 14, 2026]
Updated 1mo ago
Evaluation Results
Method     Method details             Links    Exclusion Rate  Accuracy
ICL        Model=Qwen3-4B-Instruc...  2026.01  6.4             3.45
ICL        Model=Qwen2.5-3B-Instr...  2026.01  27.59           19.21
ICL        Model=Phi-4-mini-Instr...  2026.01  45.32           28.08
Increment  Model=Qwen2.5-3B-Instr...  2026.01  55.17           30.54
SFT+       Model=Qwen2.5-3B-Instr...  2026.01  57.63           30.05
ICL        Model=Gemma-3-4B-Instr...  2026.01  58.13           33
Increment  Model=Qwen3-4B-Instruc...  2026.01  58.13           38.42
SFT+       Model=Qwen3-4B-Instruc...  2026.01  60.1            40.39
SFT+       Model=Phi-4-mini-Instr...  2026.01  61.08           38.92
Increment  Model=Phi-4-mini-Instr...  2026.01  62.07           38.92
SFT+       Model=Gemma-3-4B-Instr...  2026.01  66              36.95
Increment  Model=Gemma-3-4B-Instr...  2026.01  69.95           40.89