Agentic on τ2-Bench
Top score: 91.6 (Claude Opus 4.5)
[Chart: Score by date; x-axis shows Feb 17, 2026]
Evaluation Results
Method             Date      Score
Claude Opus 4.5    2026.02   91.6
Gemini 3 Pro       2026.02   90.7
GLM-5              2026.02   89.7
GLM-4.7            2026.02   87.4
GPT-5.2 (xhigh)    2026.02   85.5
DeepSeek-V3.2      2026.02   85.3
Kimi K2.5          2026.02   80.2