Agent on τ2-Bench

69.5 Accuracy

LongCat-Flash Exp-Chat

[Chart: accuracy over time; y-axis ticks 63.78, 65.265, 66.75, 68.235; x-axis label Dec 30, 2025]
Updated 4d ago

Evaluation Results

Date     Accuracy  Method  Links
2025.12  69.5
2025.12  69.1
2025.12  68.8
2025.12  64