Tool Use on ACEBench Parallel
[Chart: Accuracy over time. Highlighted point: 81 (Base), Apr 13, 2026.]
Updated 4d ago
Evaluation Results
Method   Model         Date      Accuracy
Base     ToolACE-2.5   2026.04   81
Prompt   ToolACE-2.5   2026.04   81
CAA      ToolACE-2.5   2026.04   79
Prompt   Watt-Tool     2026.04   77
Base     Watt-Tool     2026.04   76
CAA      Watt-Tool     2026.04   75
Prompt   Qwen3-8B      2026.04   66
CAA      Qwen3-8B      2026.04   66
CAA      Qwen3-14B     2026.04   63
Base     Qwen3-8B      2026.04   62
Base     Qwen3-14B     2026.04   62
Prompt   Qwen3-14B     2026.04   62
CAA      Qwen3-4B      2026.04   61
Base     Qwen3-4B      2026.04   60
Prompt   Qwen3-4B      2026.04   57