Multi-hop Reasoning on CommaQA-E (test)
[Chart: Exact Match by date; top point: ChatGPT (SKiC), 70 Exact Match, Aug 1, 2023]
Updated 2d ago
Evaluation Results
Method                        Method details             Links    Exact Match
ChatGPT (SKiC)                Model=ChatGPT, Prompti...  2023.08  70
text-davinci-003 (SKiC)       Model=text-davinci-003...  2023.08  66
ChatGPT (Decomp)              Model=ChatGPT, Prompti...  2023.08  64
text-davinci-003 (Decomp)     Model=text-davinci-003...  2023.08  58
ChatGPT (CoT)                 Model=ChatGPT, Prompti...  2023.08  55
ChatGPT (4-shots)             Model=ChatGPT, Prompti...  2023.08  47
LLAMA-65B (SKiC)              Model=LLAMA-65B, Promp...  2023.08  44
text-davinci-003 (CoT)        Model=text-davinci-003...  2023.08  44
text-davinci-003 (4-shots)    Model=text-davinci-003...  2023.08  42
ChatGPT (zero-shot)           Model=ChatGPT, Prompti...  2023.08  42
text-davinci-003 (zero-shot)  Model=text-davinci-003...  2023.08  34
LLAMA-65B (Decomp)            Model=LLAMA-65B, Promp...  2023.08  32
LLAMA-65B (CoT)               Model=LLAMA-65B, Promp...  2023.08  27
LLAMA-65B (4-shots)           Model=LLAMA-65B, Promp...  2023.08  15
LLAMA-65B (zero-shot)         Model=LLAMA-65B, Promp...  2023.08  12