Multi-hop Tool-use on ToolHop unseen (test)
[Chart: Accuracy over time; best result 43.1 (RimRule), Dec 31, 2025]
Evaluation Results
Method     Settings                   Date     Accuracy
RimRule    Adaptation Method=Recu...  2025.12  43.1
Few-shot   Adaptation Method=Few-...  2025.12  37.9
SEE        Adaptation Method=Self...  2025.12  35.9
Zero-shot  Adaptation Method=Zero...  2025.12  35.1