Tool Calling on ToolBench Generalization (I1-Tool)
[Chart: SoPR and SoWR. Best SoPR: 57.7 (ToolLlama*), Jan 29, 2026]
Updated 4d ago
Evaluation Results
Method       Setting    Date     SoPR   SoWR
ToolLlama*   Retrieval  2026.01  57.7   48.73
GPT-3.5*     Retrieval  2026.01  57.59  46.2
ToolGen*     -          2026.01  56.54  40.51
ToolWeaver   -          2026.01  54.85  36.08
GPT-4o-mini  Retrieval  2026.01  53.16  49.37
ToolGen      -          2026.01  45.36  32.91
ToolLlama-2  Retrieval  2026.01  28.48  26.58