API-Bank

Benchmarks

Task Name	Dataset Name	Metric	SOTA Result
Tool Calling	API-Bank L-1	F1 Name Match	94.99	46
Stepwise tool-use	API-Bank (test)	Success Rate	74	28
Tool Calling	API-Bank L-2	Name Match F1	90.42	25
Tool-use Inference	API-Bank	Match Rate (#MAT)	5.8	22
Function Calling	API-Bank	Level-1 Score	79.17	20
Tool Use	API-Bank (test)	Overall Accuracy	75.54	19
API Use	API-Bank	Success Rate	77.19	18
Tool Use	API-Bank Level 2	Accuracy	66.22	18
Tool-augmented reasoning	API-Bank	Success Rate	79.1	12
Tool Calling	API-Bank L-2 v1 (test)	F1 Name Match	88	12
Tool Calling	API-Bank L-1 v1 (test)	F1 Score	90.78	12
Function Calling	API-Bank Level-2	ROUGE-L	83.2	12
Function Calling	API-Bank Level-1	ROUGE-L	93.4	12
Tool Use	API Bank	Accuracy	90	10
Tool Learning	API-Bank LV2	Correctness	62.41	10
Single-agent tool use	API-Bank reconstructed	Correctness	79.27	9
Agentic Capability	API-Bank	Accuracy	90.45	8
Tool Retrieval and Calling	API-Bank Call+Retrieve	Task Completion Rate	26.9	8
Tool Calling	API-Bank Call	Task Completion Rate	34.7	8
Tool Retrieval and Invocation	API-Bank Level-3	Recall@k	90.67	7
Tool Use	API-Bank (L1)	Score	81.3	6
Tool Sequence Recommendation	API-Bank Level-3 50 instances (LOO-CV)	Set F1	94.5	6
Tool Use	API-Bank L2 cleaned (test)	F1 (API Matching)	87.32	5
Tool Selection	API-Bank (test)	Recall@1	59.12	4
