Multi-hop Question Answering on HotpotQA (Correctness)
Top result: ProTeGi, 62.2 Correctness
[Chart: Correctness; data point dated May 19, 2026]
Updated 14d ago
Evaluation Results
Method      Configuration              Date     Correctness
ProTeGi     Evaluation Backbone=Cl...  2026.05  62.2
GEPA        Evaluation Backbone=Cl...  2026.05  60.2
MOCHA       Evaluation Backbone=Cl...  2026.05  60
TextGrad    Evaluation Backbone=Cl...  2026.05  59.2
Seed Skill  Evaluation Backbone=Cl...  2026.05  33.6