Tool-use instruction following on ToolBench I1-Tool
[Chart: Pass Rate and Win Rate over time. Highlighted point: DFSDT, Pass Rate 71.5, Jul 31, 2023.]
Updated 4d ago
Evaluation Results
| Method | Configuration | Date | Pass Rate | Win Rate |
|---|---|---|---|---|
| DFSDT | Backbone=GPT-4, Retrie... | 2023.07 | 71.5 | 67.8 |
| DFSDT | Backbone=ChatGPT, Retr... | 2023.07 | 65 | 62 |
| DFSDT | Backbone=ToolLLaMA, Re... | 2023.07 | 64 | 59 |
| DFSDT | Backbone=ToolLLaMA, Re... | 2023.07 | 61 | 55.3 |
| ReACT | Backbone=GPT-4, Retrie... | 2023.07 | 50 | 58.8 |
| ReACT | Backbone=ChatGPT, Retr... | 2023.07 | 44 | - |
| DFSDT | Backbone=Text-Davinci-... | 2023.07 | 44 | 43.8 |
| DFSDT | Backbone=Claude-2, Ret... | 2023.07 | 31 | 44.3 |
| ReACT | Backbone=ToolLLaMA, Re... | 2023.07 | 29 | 42 |
| ReACT | Backbone=Text-Davinci-... | 2023.07 | 20 | 35.3 |
| ReACT | Backbone=Claude-2, Ret... | 2023.07 | 3.5 | 27.8 |
| ReACT & DFSDT | Backbone=Vicuna, Retri... | 2023.07 | 0 | 0 |
| ReACT & DFSDT | Backbone=Alpaca, Retri... | 2023.07 | 0 | 0 |