Function Calling on BFCL (Accuracy %)
[Chart: Accuracy (%) by date. Highlighted point: Opus-4.6 (1M), 86.67, May 31, 2026.]
Updated 1d ago
Evaluation Results
Method                     Protocol            Date      Accuracy (%)
Opus-4.6 (1M)              Agent-Post-Tr...    2026.05   86.67
Official Instruct Model    Official Inst...    2026.05   84
GLM-4.7 & ANDES (Ours)     Proposed Meth...    2026.05   78.67
Opus-4.6                   Agent-Post-Tr...    2026.05   75.92
GLM-4.7 (Scaffold-only)    Proposed Meth...    2026.05   70
Opus-4.7 (xHigh)           Agent-Post-Tr...    2026.05   62.33
GPT-5.2                    Agent-Post-Tr...    2026.05   33.33
GPT-5.4 (High)             Agent-Post-Tr...    2026.05   29.67
Gemini-3.1-Pro             Agent-Post-Tr...    2026.05   27.67
Base Model (SmolLM3-3B)    Zero-Shot, Ba...    2026.05   0
Sonnet-4.5                 Agent-Post-Tr...    2026.05   0
Qwen3-Max                  Agent-Post-Tr...    2026.05   0
Kimi-K2-Thinking           Agent-Post-Tr...    2026.05   0
MiniMax-M2.1               Agent-Post-Tr...    2026.05   0
GPT-5.1-Codex-Max          Agent-Post-Tr...    2026.05   0
MiniMax-M2.5               Agent-Post-Tr...    2026.05   0
GLM-4.7 (OpenCode)         Proposed Meth...    2026.05   0