Tool-calling on Retail-3I General 1.0
[Chart: highlighted point Qwen3-8B, 73.6 Pass@1, Jan 28, 2026; series: Pass@1, Pass@2, Pass@3]
Updated 4d ago
Evaluation Results
Method     Training     Date      Pass@1  Pass@2  Pass@3
Qwen3-8B   Fine-tuned   2026.01   73.6    70.9    69.1
Qwen3-4B   Fine-tuned   2026.01   72      70.3    69.1