Question Answering on HotpotQA (LLM Accuracy and EM)
[Chart: Exact Match (EM) and Accuracy (LLM) over time. Highlighted point: LUMOS-IQA, 36.3 Exact Match (EM); date label: Nov 9, 2023.]
Evaluation Results
| Method | Settings | Date | Exact Match (EM) | Accuracy (LLM) |
|---|---|---|---|---|
| LUMOS-IQA | Agent Model=LLAMA-2-13... | 2023.11 | 36.3 | 57.4 |
| LUMOS-IQA | Agent Model=LLAMA-2-7B... | 2023.11 | 36 | 56.8 |
| ReAcT | Agent Model=GPT-3.5-tu... | 2023.11 | 32.4 | 40.8 |
| LUMOS-IQA | Agent Model=LLAMA-2-13... | 2023.11 | 31.4 | 50.2 |
| ReWOO | Agent Model=GPT-3.5-tu... | 2023.11 | 30.4 | 42.4 |
| LUMOS-IQA | Agent Model=LLAMA-2-7B... | 2023.11 | 29.4 | 45.9 |
| FiReAct | Agent Model=CodeLLAMA-... | 2023.11 | 27.8 | - |
| FiReAct | Agent Model=LLAMA-2-7B... | 2023.11 | 26.2 | - |
| LUMOS-OQA | Agent Model=LLAMA-2-7B... | 2023.11 | 24.9 | 39.2 |
| LUMOS-IQA | Agent Model=LLAMA-2-7B... | 2023.11 | 23.5 | 37.3 |
| GPT-3.5-CoT | Agent Model=GPT-3.5-tu... | 2023.11 | 22.4 | 37.8 |
| AgentLM | Agent Model=LLAMA-2-7B... | 2023.11 | 22.3 | - |
| ReWOO-open | Agent Model=LLAMA-7B,... | 2023.11 | - | 37 |