Tool Use on API-Bench (test)
[Chart: Accuracy by date. Labeled point: Claude 3.5 Sonnet, 60 (Jul 31, 2024).]
Updated 4d ago
Evaluation Results
Method             Settings                   Date     Accuracy
Claude 3.5 Sonnet  evaluation_mode=zero-shot  2024.07  60
GPT-4o             evaluation_mode=zero-shot  2024.07  41.4
GPT-3.5 Turbo      evaluation_mode=zero-shot  2024.07  36.3
Llama 3 405B       evaluation_mode=zero-shot  2024.07  35.3
Llama 3 70B        evaluation_mode=zero-shot  2024.07  29.7
Mixtral 8x22B      evaluation_mode=zero-shot  2024.07  26
GPT-4              evaluation_mode=zero-shot  2024.07  22.5
Gemma 2 9B         evaluation_mode=zero-shot  2024.07  11.6
Llama 3 8B         evaluation_mode=zero-shot  2024.07  8.2
Mistral 7B         evaluation_mode=zero-shot  2024.07  4.7