Agent on τ2-Bench
Best Accuracy: 69.5 (LongCat-Flash Exp-Chat, Dec 30, 2025)
Updated 4d ago
Evaluation Results
Method                  Evaluation Mode  Date     Accuracy
LongCat-Flash Exp-Chat  Chat             2025.12  69.5
GLM 4.6                 Chat             2025.12  69.1
LongCat-Flash Chat      Chat             2025.12  68.8
DeepSeek V3.2           Chat             2025.12  64