ToolAce

Benchmarks

Task Name                Dataset Name               SOTA Result               Trend
Tool Calling             ToolACE 1000 tools         TSA 78.72                 9
Tool Calling             ToolACE 500 tools          TSA 82.28                 9
Intent Alignment         ToolACE                    A_intent (GPT-5.0) 85.71  6
Intent Inversion Attack  ToolACE                    S_text 81.39              6
Tool Calling             ToolACE multi-turn (test)  Accuracy 61.64            2
Tool Identification      ToolACE multi-turn (test)  Accuracy 75.34            2
Function Calling         ToolAce (test)             Accuracy (Single) 84.4    2