Abductive Reasoning on FOLIO
[Chart: Accuracy (FOLIO). Top result: PACS, 88 (May 8, 2026). Updated 23d ago.]
Evaluation Results
Method     Backbone            Date      Accuracy (FOLIO)
PACS       Llama 3.3 70B       2026.05   88
PACS       Llama 3-Instr...    2026.05   82
ARGOS      Llama 3.3 70B       2026.05   80
ARGOS      Llama 3-Instr...    2026.05   78
If-Beam    Llama 3.3 70B       2026.05   78
SC-20      Llama 3.3 70B       2026.05   77
If-Beam    Llama 3-Instr...    2026.05   74
COT        Llama 3.3 70B       2026.05   73
SC-20      Llama 3-Instr...    2026.05   71
LoT        Llama 3.3 70B       2026.05   70
LoT        Llama 3-Instr...    2026.05   69
COT        Llama 3-Instr...    2026.05   68
LLM-Tree   Llama 3-Instr...    2026.05   63
LLM-Tree   Llama 3.3 70B       2026.05   63