Multi-hop reasoning on Multi-hop reasoning tasks T2 L ≈ 9 steps
[Chart: API Success Rate by date (axis point: Dec 1, 2025); top result: ITR, 79. Other selectable metrics: Episode Cost ($), p50 Latency (s).]
Updated 4d ago
Evaluation Results
Method            Date     API Success Rate  Episode Cost ($)  p50 Latency (s)
ITR               2025.12  79                0.86              44
B2 Prompt-RAG     2025.12  72                1.45              51
B1 Router-Only    2025.12  70                2.55              65
B0 Monolithic     2025.12  64                2.9               68