Tool Calling on Average across 5 benchmarks
[Chart: scores over time. Series: F1 (Name), F1 (Name + Param). Highlighted point: HiTEC-ICL, F1 (Name) 88.47, May 28, 2025.]
Updated 4d ago
Evaluation Results
Method     Backbone      Date     F1 (Name)  F1 (Name + Param)
HiTEC-ICL  GPT-4-Turbo   2025.05  88.47      79.12
Vanilla    GPT-4-Turbo   2025.05  88.39      77.86
HiTEC-ICL  Llama3.1-8B   2025.05  85.59      66.42
CoT        GPT-4-Turbo   2025.05  84.84      75.62
Vanilla    Llama3.1-8B   2025.05  83.65      64.67
HiTEC-ICL  Llama3-8B     2025.05  83.12      65.52
CoT        Llama3.1-8B   2025.05  73.87      60.08
CoT        Llama3-8B     2025.05  59.49      52.03
Vanilla    Llama3-8B     2025.05  57.31      48.29
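
The page does not define its two metrics. As a rough illustration only, the sketch below shows one common way such scores could be computed: F1 over predicted versus gold tool calls, matching either on the tool name alone or on the name together with its parameters. The call format (`name`, `arguments`), the helper names, and the exact-match rule are all assumptions made for this example, not the benchmark's actual scoring code.

```python
from collections import Counter


def f1_score(predicted, gold):
    """F1 between two multisets of hashable items (assumed exact-match scoring)."""
    pred_counts, gold_counts = Counter(predicted), Counter(gold)
    # Multiset intersection counts each item at most min(pred, gold) times.
    overlap = sum((pred_counts & gold_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(gold_counts.values())
    return 2 * precision * recall / (precision + recall)


def name_key(call):
    # Hypothetical key for "F1 (Name)": only the tool name must match.
    return call["name"]


def name_param_key(call):
    # Hypothetical key for "F1 (Name + Param)": name and all arguments must match.
    return (call["name"], tuple(sorted(call["arguments"].items())))


# Toy data, invented for illustration.
gold = [
    {"name": "get_weather", "arguments": {"city": "Paris"}},
    {"name": "book_flight", "arguments": {"dest": "Rome"}},
]
pred = [
    {"name": "get_weather", "arguments": {"city": "Paris"}},
    {"name": "book_flight", "arguments": {"dest": "Milan"}},  # right tool, wrong argument
]

f1_name = f1_score([name_key(c) for c in pred], [name_key(c) for c in gold])
f1_name_param = f1_score(
    [name_param_key(c) for c in pred], [name_param_key(c) for c in gold]
)
print(f"F1 (Name): {f1_name:.2f}, F1 (Name + Param): {f1_name_param:.2f}")
```

Under this assumed definition, a call with the right tool but a wrong argument counts for the name-only score and not for the name-plus-parameter score. That is consistent with F1 (Name + Param) being lower than F1 (Name) in every row of the table.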