APIGen

Benchmarks

Task Name                           Dataset Name           SOTA Result                        Trend
Tool Retrieval                      APIGen                 NDCG@10: 0.8575                    44
Argument Generation                 APIGen sampled (test)  Argument F1 (2 calls): 88.1        15
Tool Selection                      APIGen sampled (test)  Tool Selection F1 (2 calls): 99.4  15
Tool-Calling and Answer Generation  APIGen-MT (test)       Action Recall: 90.18               4
Function Calling                    APIGen (test)          Score (Single): 89.6               2