Tool-use instruction following on ToolBench I2-Instruction
[Chart: top result is DFSDT with a Pass Rate of 81.5 (Jul 31, 2023). Metrics shown: Pass Rate, Win Rate.]
Updated 4d ago
Evaluation Results
| Method        | Configuration             | Date    | Pass Rate | Win Rate |
|---------------|---------------------------|---------|-----------|----------|
| DFSDT         | Backbone=ToolLLaMA, Re... | 2023.07 | 81.5      | 68.5     |
| DFSDT         | Backbone=GPT-4, Retrie... | 2023.07 | 79.5      | 73.3     |
| DFSDT         | Backbone=ToolLLaMA, Re... | 2023.07 | 77        | 68.5     |
| DFSDT         | Backbone=ChatGPT, Retr... | 2023.07 | 75        | 72       |
| ReACT         | Backbone=GPT-4, Retrie... | 2023.07 | 67        | 65.8     |
| ReACT         | Backbone=ChatGPT, Retr... | 2023.07 | 42.5      | -        |
| DFSDT         | Backbone=Text-Davinci-... | 2023.07 | 37        | 40.5     |
| ReACT         | Backbone=ToolLLaMA, Re... | 2023.07 | 30.5      | 50.8     |
| DFSDT         | Backbone=Claude-2, Ret... | 2023.07 | 17        | 36.8     |
| ReACT         | Backbone=Text-Davinci-... | 2023.07 | 8.5       | 29.8     |
| ReACT         | Backbone=Claude-2, Ret... | 2023.07 | 6         | 35       |
| ReACT & DFSDT | Backbone=Vicuna, Retri... | 2023.07 | 0         | 0        |
| ReACT & DFSDT | Backbone=Alpaca, Retri... | 2023.07 | 0         | 0        |