Agent Task Completion on τ²-Bench

92.1 Avg Task Reward

GPT-5.1 with H-EPM

[Chart: Avg Task Reward; y-axis ticks 83.884, 86.017, 88.15, 90.283; x-axis date Dec 8, 2025]
Updated 4d ago

Evaluation Results

Method    Links
2025.12    92.1
2025.12    84.2