First-order logical inference on FOLIO-wiki (testable)
[Chart: Accuracy over time. Best result: CR, 87.45 accuracy, Aug 8, 2023]
Evaluation Results
Method    Configuration              Date     Accuracy
CR        Model=GPT-4, n=2           2023.08  87.45
CoT-SC    Model=GPT-4, k=16          2023.08  85.02
CoT       Model=GPT-4                2023.08  84.46
Direct    Model=GPT-4                2023.08  80.52
CR        Model=GPT-3.5-turbo, n=2   2023.08  73.03
CR        Model=LLaMA-65B, n=2       2023.08  72.1
CoT-SC    Model=LLaMA-65B, k=16      2023.08  70.79
Direct    Model=LLaMA-65B            2023.08  67.42
CoT       Model=LLaMA-65B            2023.08  67.42
CoT       Model=GPT-3.5-turbo        2023.08  64.61
CoT-SC    Model=GPT-3.5-turbo, k=16  2023.08  63.33
Direct    Model=GPT-3.5-turbo        2023.08  62.92
CR        Model=LLaMA 13B, n=2       2023.08  53.37
CoT-SC    Model=LLaMA 13B, k=16      2023.08  52.43
CoT       Model=LLaMA 13B            2023.08  49.06
Direct    Model=LLaMA 13B            2023.08  44.75
[Random]                             2023.08  33.33
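
One way to read the results above is to compare CR against the strongest non-CR method for each model. The sketch below does that arithmetic on the accuracy values listed in the table. The dictionary layout and the helper name gain_over_best_baseline are assumptions made for illustration and are not part of the leaderboard.

```python
# Accuracy values copied from the results above.
# The nested-dict layout is an assumption for illustration only.
ACCURACY = {
    "GPT-4": {"Direct": 80.52, "CoT": 84.46, "CoT-SC": 85.02, "CR": 87.45},
    "GPT-3.5-turbo": {"Direct": 62.92, "CoT": 64.61, "CoT-SC": 63.33, "CR": 73.03},
    "LLaMA-65B": {"Direct": 67.42, "CoT": 67.42, "CoT-SC": 70.79, "CR": 72.1},
    "LLaMA 13B": {"Direct": 44.75, "CoT": 49.06, "CoT-SC": 52.43, "CR": 53.37},
}


def gain_over_best_baseline(scores):
    """Return (best non-CR method, CR accuracy minus that method's accuracy)."""
    baselines = {method: acc for method, acc in scores.items() if method != "CR"}
    best = max(baselines, key=baselines.get)
    return best, round(scores["CR"] - baselines[best], 2)


for model, scores in ACCURACY.items():
    best, gain = gain_over_best_baseline(scores)
    print(f"{model}: CR exceeds {best} by {gain} accuracy points")
```

These are plain differences between listed accuracies; the page gives no confidence intervals, so the sketch says nothing about whether a given gap is statistically meaningful.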