Tool-use Planning on ToolBench (Average over all sets)
[Chart: Win Rate. Best result: GPT4 TOPGUN, 86.54, Feb 15, 2024. Updated 4d ago.]
Evaluation Results
Method             Reference        Date     Win Rate
GPT4 TOPGUN        ChatGPT REACT    2024.02  86.54
GPT4 TOPGUN        T.LLAMA REACT    2024.02  84.61
GPT4 TOPGUN        GPT4 ReACT       2024.02  80.27
GPT4 TOPGUN        T.LLAMA DFSD...  2024.02  79.44
GPT4 TOPGUN        ChatGPT DFSDT    2024.02  78.71
GPT4 TOPGUN        GPT4 DFSDT       2024.02  78.59
GPT4 TOPGUN        T.LLAMA DFSDT    2024.02  78.44
GPT4 DFSDT         ChatGPT REACT    2024.02  70.4
ChatGPT DFSDT      ChatGPT REACT    2024.02  64.3
GPT4 ReACT         ChatGPT REACT    2024.02  64
T.LLAMA DFSDT+Ret  ChatGPT REACT    2024.02  63.1
T.LLAMA DFSDT      ChatGPT REACT    2024.02  60
T.LLAMA REACT      ChatGPT REACT    2024.02  47
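
Each Win Rate above is reported relative to a named reference method. As a rough illustration only, the sketch below computes a win rate as the percentage of pairwise comparisons in which the candidate method is preferred over the reference. The function name `win_rate`, the boolean outcome format, and the sample judgments are assumptions made for this example; the page does not describe the benchmark's actual evaluation procedure, including how ties or failed runs are handled.

```python
def win_rate(outcomes):
    """Return the percentage of pairwise comparisons won by the candidate.

    `outcomes` is a hypothetical list of booleans, one per test query:
    True means the candidate was preferred over the reference method.
    """
    if not outcomes:
        raise ValueError("need at least one comparison")
    wins = sum(1 for won in outcomes if won)
    return 100.0 * wins / len(outcomes)


# Made-up judgments for four queries (not data from the leaderboard).
example = [True, True, False, True]
print(win_rate(example))  # 75.0
```

Under this reading, a value near 50 would mean the method and its reference are preferred about equally often.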