
FOLIO

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Logical Reasoning | FOLIO | Accuracy 89.2 | 119
Logical Reasoning | FOLIO (test) | Accuracy 95.6 | 58
Natural Language Inference | FOLIO | Accuracy 0.61 | 26
NL-to-FOL Syntax Correctness | FOLIO (test) | Syntax Correctness Rate 99 | 26
First-Order Logic Reasoning | FOLIO | Pass@1 Success Rate 84.7 | 18
Binary Classification | FOLIO | Accuracy 81 | 18
Logical Reasoning | FOLIO-wiki-curated (test) | Accuracy 98.04 | 17
Explanation Refinement | FOLIO | Initial Score 85.25 | 15
Deductive logical reasoning | FOLIO 203 (dev) | Exclusion Rate 6.4 | 12
Adding Mistake | FOLIO | AOC 0.714 | 7
Truncated CoT Answering | FOLIO | AOC 0.35 | 7
First-Order Logic translation | FOLIO (test) | BLEU 66 | 7
Logical Reasoning | FOLIO (val) | Accuracy 69.12 | 5
Logical reasoning | FOLIO | Optimization-phase Token Usage 453 | 3
Logical Reasoning | FOLIO | Accuracy 48 | 2