Function Calling on BFCL (Accuracy)
[Chart: Accuracy over time, Apr 23, 2025 to Mar 17, 2026. Highlighted result: Llama 3.1 Instruct, 77.9.]
Updated 1mo ago
Evaluation Results
| Method | Details | Date | Accuracy |
|---|---|---|---|
| Llama 3.1 Instruct | Model Scale=70B | 2025.04 | 77.9 |
| ParamΔ | Model Scale=70B | 2025.04 | 77.7 |
| Llama 3 Instruct | Model Scale=70B | 2025.04 | 76.8 |
| Qwen3-14B | Group=Larger, Configur... | 2026.03 | 74.3 |
| Qwen3-30B-A3B | Group=Larger, Configur... | 2026.03 | 74.1 |
| Llama 3.1 Instruct | Model Scale=8B | 2025.04 | 67.9 |
| ParamΔ | Model Scale=8B | 2025.04 | 60.9 |
| Llama 3 Instruct | Model Scale=8B | 2025.04 | 60.1 |
| Gpt-oss-20b-high | Group=Larger, Configur... | 2026.03 | 58.9 |
| gemma-3-12b-it | Group=Larger, Configur... | 2026.03 | 52.2 |
| EngGPT2-16B-A3B | Group=Comparable, Conf... | 2026.03 | 48.5 |
| gemma-2-9b-it | Group=Comparable, Conf... | 2026.03 | 44.4 |
| Moonlight-16B-A3B-Instruct | Group=Comparable, Conf... | 2026.03 | 42.2 |
| Llama-3.1-8B-Instruct | Group=Comparable, Conf... | 2026.03 | 37.4 |