API-Bank

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Tool Calling | API-Bank L-1 | F1 Name Match: 94.99 | 46 |
| Tool Calling | API-Bank L-2 | Name Match F1: 90.42 | 25 |
| Tool-use Inference | API-Bank | Match Rate (#MAT): 5.8 | 22 |
| Tool Use | API-Bank (test) | Accuracy: 92.6 | 16 |
| Tool-augmented reasoning | API-Bank | Success Rate: 79.1 | 12 |
| Tool Calling | API-Bank L-2 v1 (test) | F1 Name Match: 88 | 12 |
| Tool Calling | API-Bank L-1 v1 (test) | F1 Score: 90.78 | 12 |
| Function Calling | API-Bank Level-2 | ROUGE-L: 83.2 | 12 |
| Function Calling | API-Bank Level-1 | ROUGE-L: 93.4 | 12 |
| Tool Use | API Bank | Accuracy: 90 | 10 |
| Tool Learning | API-Bank LV2 | Correctness: 62.41 | 10 |
| Single-agent tool use | API-Bank reconstructed | Correctness: 79.27 | 9 |