Multi-hop Question Answering on HotpotQA subset of 100 questions
[Chart: EM over time. Top result: LATS, 71 EM (Oct 6, 2023)]
Updated 4d ago
Evaluation Results
| Method | Configuration | Date | EM |
|---|---|---|---|
| LATS | LLM=GPT-3.5, Prompt Me... | 2023.10 | 71 |
| LATS | LLM=GPT-3.5, Prompt St... | 2023.10 | 65 |
| LATS | LLM=GPT-3.5, Prompt Me... | 2023.10 | 63 |
| LATS | Backbone=GPT-3.5, Prom... | 2023.10 | 62 |
| RAP | Backbone=GPT-3.5, k=50 | 2023.10 | 60 |
| RAP | Backbone=GPT-3.5, n=10 | 2023.10 | 60 |
| LATS | LLM=GPT-3.5, Prompt St... | 2023.10 | 58 |
| ToT | Backbone=GPT-3.5, Sear... | 2023.10 | 55 |
| RAP | LLM=GPT-3.5, Prompt Me... | 2023.10 | 54 |
| Reflexion | LLM=GPT-3.5, Prompt St... | 2023.10 | 51 |
| ToT | LLM=GPT-3.5, Prompt Me... | 2023.10 | 39 |
| CoT-SC | Backbone=GPT-3.5 | 2023.10 | 38 |
| ReAct | LLM=GPT-3.5, Prompt St... | 2023.10 | 38 |
| CoT | Backbone=GPT-3.5 | 2023.10 | 34 |
| Base LM | Backbone=GPT-3.5 | 2023.10 | 32 |
| ReAct | LLM=GPT-3.5, Prompt St... | 2023.10 | 32 |