Tool-use Planning on ToolBench G2-Category
[Chart: Win Rate over time. Top entry: GPT4 TOPGUN, Win Rate 78.78 (Feb 15, 2024)]
Updated 4d ago
Evaluation Results
Method             Reference            Date     Win Rate
GPT4 TOPGUN        ChatGPT REACT        2024.02  78.78
GPT4 TOPGUN        T.LLAMA REACT        2024.02  77.71
GPT4 TOPGUN        GPT4 ReACT           2024.02  73.75
GPT4 TOPGUN        ChatGPT DFSDT        2024.02  73.07
GPT4 TOPGUN        T.LLAMA DFSD...      2024.02  72.92
GPT4 TOPGUN        T.LLAMA DFSDT        2024.02  71.8
GPT4 TOPGUN        GPT4 DFSDT           2024.02  71.35
ChatGPT DFSDT      ChatGPT REACT        2024.02  64.8
GPT4 DFSDT         ChatGPT REACT        2024.02  63.3
T.LLAMA DFSDT+Ret  ChatGPT REACT        2024.02  60.8
GPT4 ReACT         ChatGPT REACT        2024.02  60.3
T.LLAMA DFSDT      ChatGPT REACT        2024.02  58
T.LLAMA REACT      ChatGPT REACT        2024.02  41.8