Tool-use instruction following on ToolBench Average
Top Pass Rate: 71.1 (DFSDT, Jul 31, 2023)
[Chart: Pass Rate and Win Rate over time]
Evaluation Results
| Method | Details | Date | Pass Rate | Win Rate |
|---|---|---|---|---|
| DFSDT | Backbone=GPT-4, Retrie... | 2023.07 | 71.1 | 70.4 |
| DFSDT | Backbone=ToolLLaMA, Re... | 2023.07 | 67.3 | 63.1 |
| DFSDT | Backbone=ToolLLaMA, Re... | 2023.07 | 66.7 | 60 |
| DFSDT | Backbone=ChatGPT, Retr... | 2023.07 | 64.8 | 64.3 |
| ReACT | Backbone=GPT-4, Retrie... | 2023.07 | 57.2 | 64.4 |
| DFSDT | Backbone=Text-Davinci-... | 2023.07 | 43.1 | 46.3 |
| ReACT | Backbone=ChatGPT, Retr... | 2023.07 | 40.2 | - |
| ReACT | Backbone=ToolLLaMA, Re... | 2023.07 | 29 | 47 |
| DFSDT | Backbone=Claude-2, Ret... | 2023.07 | 22.6 | 43.5 |
| ReACT | Backbone=Text-Davinci-... | 2023.07 | 16.5 | 33.2 |
| ReACT | Backbone=Claude-2, Ret... | 2023.07 | 6.8 | 34.4 |
| ReACT & DFSDT | Backbone=Vicuna, Retri... | 2023.07 | 0 | 0 |
| ReACT & DFSDT | Backbone=Alpaca, Retri... | 2023.07 | 0 | 0 |