Multi-Hop Question Answering on 2Wiki (out-of-domain)
[Chart: Accuracy by date. Highlighted point: EvolveR, Accuracy 42, Feb 9, 2026.]
Evaluation Results
Method        Details                     Date      Accuracy
EvolveR       Backbone=Qwen2.5-7B-In...   2026.02   42
SKILLRL       Backbone=Qwen2.5-7B-In...   2026.02   40.3
Search-R1     Backbone=Qwen2.5-7B-In...   2026.02   40.1
StepSearch    Backbone=Qwen2.5-7B-In...   2026.02   36.6
ZeroSearch    Backbone=Qwen2.5-7B-In...   2026.02   35.2
R1-Instruct   Backbone=Qwen2.5-7B-In...   2026.02   27.5
Search-o1     Backbone=Qwen2.5-7B-In...   2026.02   27
RAG           Backbone=Qwen2.5-7B-In...   2026.02   23.2
CoT           Backbone=Qwen2.5-7B-In...   2026.02   22.6
Qwen2.5       Backbone=Qwen2.5-7B-In...   2026.02   22.2