Multi-Hop Question Answering on Bamboogle (out-of-domain)
Best result: SKILLRL, 73.8 Accuracy
[Figure: Accuracy over time (x-axis: date, Feb 9, 2026; y-axis: Accuracy)]
Updated 2d ago
Evaluation Results

Method        Details                     Date      Accuracy
SKILLRL       Backbone=Qwen2.5-7B-In...   2026.02   73.8
EvolveR       Backbone=Qwen2.5-7B-In...   2026.02   54.4
StepSearch    Backbone=Qwen2.5-7B-In...   2026.02   40
Search-R1     Backbone=Qwen2.5-7B-In...   2026.02   36.8
Search-o1     Backbone=Qwen2.5-7B-In...   2026.02   30.4
ZeroSearch    Backbone=Qwen2.5-7B-In...   2026.02   27.8
CoT           Backbone=Qwen2.5-7B-In...   2026.02   24
R1-Instruct   Backbone=Qwen2.5-7B-In...   2026.02   19.2
RAG           Backbone=Qwen2.5-7B-In...   2026.02   16.8
Qwen2.5       Backbone=Qwen2.5-7B-In...   2026.02   14.4