Agentic Tool Calling on BFCL v3 (Multi-Turn)
[Figure: Multi-Turn Success Rate chart. Highlighted result: Kimi-K2, 60.5. Date shown: May 29, 2026.]
Evaluation Results
Method          Details                      Date      Multi-Turn Success Rate
Kimi-K2         Model Identifier=Kimi-K2     2026.05   60.5
MAVEN           Base Model=GPT-OSS-120b      2026.05   58.5
Qwen3-Th-235B   Model Identifier=Qwen3...    2026.05   53.5
o4-mini         Model Identifier=o4-mi...    2026.05   53
o3              Model Identifier=o3-20...    2026.05   44
DeepSeek-V3.1   Model Identifier=DeepS...    2026.05   44
Gemini-2.5      Model Identifier=gemin...    2026.05   35
GPT-5           Model Identifier=gpt-5...    2026.05   33.5