End-to-end Tool-use on ToolBench I2-Cat v1
[Chart: SoPR and SoWR over time. Top SoPR: 46.24 (ToolWeaver), Jan 29, 2026.]
Updated 4d ago
Evaluation Results
Method        Setting             Links     SoPR    SoWR
ToolWeaver    Direct Generation   2026.01   46.24   35.48
ToolGen       Direct Generation   2026.01   45.56   37.9
GPT-4o-mini   Retrieval (Re.)     2026.01   39.38   -
ToolLlama-2   Retrieval (Re.)     2026.01   19.09   20.16