Logical Reasoning on FOLIO-wiki-curated (test)
[Chart: Accuracy over time. Best result: CR, 98.04 accuracy, Aug 8, 2023]
Updated 4d ago
Evaluation Results
Method    Setting                    Date     Accuracy (%)
CR        Model=GPT-4, n=2           2023.08  98.04
CoT-SC    Model=GPT-4, k=16          2023.08  96.09
CoT       Model=GPT-4                2023.08  95
Direct    Model=GPT-4                2023.08  89.57
CR        Model=LLaMA-65B, n=2       2023.08  79.57
CoT-SC    Model=LLaMA-65B, k=16      2023.08  79.13
CR        Model=GPT-3.5-turbo, n=2   2023.08  78.7
Direct    Model=LLaMA-65B            2023.08  74.78
CoT       Model=LLaMA-65B            2023.08  74.13
CoT       Model=GPT-3.5-turbo        2023.08  70.65
Direct    Model=GPT-3.5-turbo        2023.08  69.57
CoT-SC    Model=GPT-3.5-turbo, k=16  2023.08  69.32
CR        Model=LLaMA-13B, n=2       2023.08  55.87
CoT-SC    Model=LLaMA-13B, k=16      2023.08  53.7
CoT       Model=LLaMA-13B            2023.08  52.17
Direct    Model=LLaMA-13B            2023.08  49.13
[Random]  Model=Baseline             2023.08  33.33