Tool Calling on ToolACE 1000 tools
[Chart: TSA by date. Highlighted point: JTPRO, 78.72 TSA, Apr 20, 2026. Metrics: TSA, SFA, OSR.]
Evaluation Results
| Method | Backbone    | Date    | TSA    | SFA    | OSR    |
|--------|-------------|---------|--------|--------|--------|
| JTPRO  | GPT-5       | 2026.04 | 78.72  | 89.26  | 73.55  |
| JTPRO  | GPT-4o mini | 2026.04 | 75.13  | 83.59  | 63.64  |
| GEPA   | GPT-5       | 2026.04 | 75.13  | 86.4   | 67.77  |
| GEPA   | GPT-4o mini | 2026.04 | 73.39  | 83.36  | 60.33  |
| JTPRO  | o3-mini     | 2026.04 | 71.48  | 87.46  | 64.46  |
| GEPA   | o3-mini     | 2026.04 | 70.11  | 86.59  | 58.68  |
| Base   | GPT-5       | 2026.04 | 67.658 | 87.35  | 62.366 |
| Base   | GPT-4o mini | 2026.04 | 61.115 | 86.67  | 58.18  |
| Base   | o3-mini     | 2026.04 | 58.916 | 85.036 | 51.272 |
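One way to read the results above is to compare each method against the Base row for the same backbone. The sketch below is illustrative only: it copies the TSA values from the results, and the helper name `tsa_gain_over_base` is a hypothetical name introduced here, not part of any benchmark tooling.

```python
# TSA scores copied from the evaluation results, keyed by (method, backbone).
TSA = {
    ("JTPRO", "GPT-5"): 78.72,
    ("JTPRO", "GPT-4o mini"): 75.13,
    ("JTPRO", "o3-mini"): 71.48,
    ("GEPA", "GPT-5"): 75.13,
    ("GEPA", "GPT-4o mini"): 73.39,
    ("GEPA", "o3-mini"): 70.11,
    ("Base", "GPT-5"): 67.658,
    ("Base", "GPT-4o mini"): 61.115,
    ("Base", "o3-mini"): 58.916,
}


def tsa_gain_over_base(scores):
    """Return {(method, backbone): TSA gain over the Base row for that backbone}."""
    gains = {}
    for (method, backbone), score in scores.items():
        if method == "Base":
            continue  # Base is the reference point, not a compared method
        base = scores[("Base", backbone)]
        gains[(method, backbone)] = round(score - base, 3)
    return gains


if __name__ == "__main__":
    for (method, backbone), gain in sorted(tsa_gain_over_base(TSA).items()):
        print(f"{method:6s} {backbone:12s} {gain:+.3f}")
```

The same pattern applies to the SFA and OSR columns by swapping in those values.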