Tool-use instruction following on ToolBench I1-Instruction
[Chart: Pass Rate and Win Rate over time. Highlighted point: DFSDT, Pass Rate 64, Jul 31, 2023.]
Evaluation Results
Method         Configuration              Date     Pass Rate  Win Rate
DFSDT          Backbone=ToolLLaMA, Re...  2023.07  64         62.3
DFSDT          Backbone=GPT-4, Retrie...  2023.07  60         67.5
DFSDT          Backbone=ToolLLaMA, Re...  2023.07  57         55
DFSDT          Backbone=ChatGPT, Retr...  2023.07  54.5       60.5
ReACT          Backbone=GPT-4, Retrie...  2023.07  53.5       60
DFSDT          Backbone=Text-Davinci-...  2023.07  43.5       40.3
ReACT          Backbone=ChatGPT, Retr...  2023.07  41.5       -
ReACT          Backbone=ToolLLaMA, Re...  2023.07  25         45
DFSDT          Backbone=Claude-2, Ret...  2023.07  20.5       38
ReACT          Backbone=Text-Davinci-...  2023.07  12         28.5
ReACT          Backbone=Claude-2, Ret...  2023.07  5.5        31
ReACT & DFSDT  Backbone=Vicuna, Retri...  2023.07  0          0
ReACT & DFSDT  Backbone=Alpaca, Retri...  2023.07  0          0
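
The results above are ordered by Pass Rate, and the ordering changes if Win Rate is used instead. The following is a minimal sketch of re-ranking the listed rows by either metric. The names `Result`, `RESULTS`, and `rank_by` are hypothetical and not part of the leaderboard; the configuration strings are kept truncated exactly as the page shows them, and the "-" entry is treated as a missing value, which is an assumption.

```python
from typing import List, NamedTuple, Optional


class Result(NamedTuple):
    method: str
    config: str                 # truncated on the page; kept verbatim
    pass_rate: float
    win_rate: Optional[float]   # None where the page shows "-"


# Rows transcribed from the leaderboard, in the order listed.
RESULTS = [
    Result("DFSDT", "Backbone=ToolLLaMA, Re...", 64.0, 62.3),
    Result("DFSDT", "Backbone=GPT-4, Retrie...", 60.0, 67.5),
    Result("DFSDT", "Backbone=ToolLLaMA, Re...", 57.0, 55.0),
    Result("DFSDT", "Backbone=ChatGPT, Retr...", 54.5, 60.5),
    Result("ReACT", "Backbone=GPT-4, Retrie...", 53.5, 60.0),
    Result("DFSDT", "Backbone=Text-Davinci-...", 43.5, 40.3),
    Result("ReACT", "Backbone=ChatGPT, Retr...", 41.5, None),
    Result("ReACT", "Backbone=ToolLLaMA, Re...", 25.0, 45.0),
    Result("DFSDT", "Backbone=Claude-2, Ret...", 20.5, 38.0),
    Result("ReACT", "Backbone=Text-Davinci-...", 12.0, 28.5),
    Result("ReACT", "Backbone=Claude-2, Ret...", 5.5, 31.0),
    Result("ReACT & DFSDT", "Backbone=Vicuna, Retri...", 0.0, 0.0),
    Result("ReACT & DFSDT", "Backbone=Alpaca, Retri...", 0.0, 0.0),
]


def rank_by(results: List[Result], metric: str) -> List[Result]:
    """Sort descending by the given metric; rows with a missing value go last."""
    present = [r for r in results if getattr(r, metric) is not None]
    missing = [r for r in results if getattr(r, metric) is None]
    ranked = sorted(present, key=lambda r: getattr(r, metric), reverse=True)
    return ranked + missing


best_pass = rank_by(RESULTS, "pass_rate")[0]
best_win = rank_by(RESULTS, "win_rate")[0]
print(best_pass.method, best_pass.config, best_pass.pass_rate)
print(best_win.method, best_win.config, best_win.win_rate)
```

Sorting on Win Rate moves the DFSDT row with the GPT-4 backbone to the top, while the Pass Rate ordering keeps the first ToolLLaMA row in front. Python's `sorted` is stable, so tied rows keep the order in which they were listed.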