
API-Bank

Benchmarks

| Task Name                | Dataset Name            | SOTA Result         | Trend |
|--------------------------|-------------------------|---------------------|-------|
| Tool Calling             | API-Bank L-1            | F1 Name Match 94.99 | 46    |
| Tool Calling             | API-Bank L-2            | Name Match F1 90.42 | 25    |
| Tool-augmented reasoning | API-Bank                | Success Rate 79.1   | 12    |
| Tool Calling             | API-Bank L-2 v1 (test)  | F1 Name Match 88    | 12    |
| Tool Calling             | API-Bank L-1 v1 (test)  | F1 Score 90.78      | 12    |
| Function Calling         | API-Bank Level-2        | ROUGE-L 83.2        | 12    |
| Function Calling         | API-Bank Level-1        | ROUGE-L 93.4        | 12    |
| Tool Learning            | API-Bank LV2            | Correctness 62.41   | 10    |
| Tool Use                 | API-Bank (test)         | Accuracy 92.6       | 10    |
| Single-agent tool use    | API-Bank reconstructed  | Correctness 79.27   | 9     |