
Berkeley Function Calling Leaderboard

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Function calling | BFCL (Berkeley Function Calling Leaderboard) | Base Score: 41.8 | 28 |
| Function Calling | Berkeley Function Calling Leaderboard (BFCL) Overall November 19, 2025 | Non-live Accuracy: 69.44 | 20 |
| Function Calling | Berkeley Function Calling Leaderboard (BFCL) Live and Non-live | Non-live AST Score: 90.8 | 11 |
| Function Calling | Berkeley Function Calling Leaderboard (BFCL) v4 | Simple Accuracy: 77.25 | 9 |
| Function Calling | Berkeley Function-Calling Leaderboard (BFCL) | Non-Live Multiple AST Success Rate: 96 | 7 |
| Function Calling | Berkeley Function Calling Leaderboard (BFCL) Live v3 | Score: 53.8 | 6 |
| Function Calling | Berkeley Function Calling Leaderboard (BFCL) Non Live | Score: 70.1 | 6 |
| Function Calling | Berkeley Function Calling Leaderboard (BFCL) Extended Setting (Non-Live) | Simple Success Rate: 74.92 | 6 |
| Function Calling | Berkeley Function Calling Leaderboard (BFCL) | Overall Success: 68.92 | 5 |
| Tool / Agent | Berkeley Function Calling Leaderboard EN | Score: 36.17 | 2 |