Exact Match on HotPotQA (Multi-hop Question Answering)
[Chart: Exact Match (HotPotQA) by date; plotted point: ReAct, 36, Apr 5, 2026]
Updated 12d ago
Evaluation Results
Method  Model              Date     Exact Match (HotPotQA)
ReAct   GPT-5.4            2026.04  36
ReAct   Claude Haiku 4.5   2026.04  26.5
ReAct   Claude Sonnet 4.6  2026.04  16.7
PTR     GPT-5.4            2026.04  12.2
PTR     GPT-4o-Mini        2026.04  12
ReAct   GPT-4o-Mini        2026.04  8
PTR     Claude Sonnet 4.6  2026.04  8
PTR     Claude Haiku 4.5   2026.04  2