Tool Calling on ToolACE 500 tools
[Chart: TSA scores; top result 82.28 (JTPRO), Apr 20, 2026. Metrics shown: TSA, SFA, OSR.]
Evaluation Results
Method  Backbone     Date     TSA     SFA     OSR
JTPRO   GPT-5        2026.04  82.28   90      74.38
GEPA    GPT-5        2026.04  77.17   85.75   66.12
JTPRO   o3-mini      2026.04  76.19   88.52   65.29
JTPRO   GPT-4o mini  2026.04  75.25   88.12   69.42
GEPA    GPT-4o mini  2026.04  73.33   85.27   61.98
GEPA    o3-mini      2026.04  73.33   85.27   61.98
Base    GPT-5        2026.04  73.02   84.785  62.73
Base    o3-mini      2026.04  70.78   84.994  59.454
Base    GPT-4o mini  2026.04  63.832  86.996  60