Tool Use on API-Bench (test)

Accuracy: 60

Claude 3.5 Sonnet

Updated 1mo ago

Evaluation Results

Method	Links	Accuracy
Claude 3.5 Sonnet 2024.07		60
GPT-4o 2024.07		41.4
GPT-3.5 Turbo 2024.07		36.3
Llama 3 405B 2024.07		35.3
Llama 3 70B 2024.07		29.7
Mixtral 8x22B 2024.07		26
GPT-4 2024.07		22.5
Gemma 2 9B 2024.07		11.6
Llama 3 8B 2024.07		8.2
Mistral 7B 2024.07		4.7