Agent Task Completion on τ²-Bench
[Chart: Avg Task Reward over time. Shown point: GPT-5.1 with H-EPM, 92.1, Dec 8, 2025.]
Evaluation Results
Method                Details                     Date      Avg Task Reward
GPT-5.1 with H-EPM    Model=GPT-5.1, Enhance...   2025.12   92.1
GPT-5.1 base          Model=GPT-5.1, Configu...   2025.12   84.2
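The gap between the two entries can be worked out directly from the listed scores. The sketch below is only illustrative arithmetic on the two Avg Task Reward values shown above; the function name `gain_over_baseline` and the dictionary layout are assumptions made for this example, not part of the leaderboard or of τ²-Bench.

```python
# Scores copied from the Evaluation Results listing above.
scores = {
    "GPT-5.1 with H-EPM": 92.1,
    "GPT-5.1 base": 84.2,
}


def gain_over_baseline(enhanced: float, baseline: float) -> tuple[float, float]:
    """Return (absolute gain in points, relative gain in percent).

    Hypothetical helper for illustration; both values are rounded to one
    decimal place to match the precision of the listed scores.
    """
    absolute = enhanced - baseline
    relative_pct = 100.0 * absolute / baseline
    return round(absolute, 1), round(relative_pct, 1)


absolute, relative_pct = gain_over_baseline(
    scores["GPT-5.1 with H-EPM"], scores["GPT-5.1 base"]
)
print(f"Absolute gain: {absolute} points")   # 7.9
print(f"Relative gain: {relative_pct}%")     # 9.4
```

On these two entries, the H-EPM configuration sits 7.9 points above the base model, which is roughly a 9.4% relative increase in Avg Task Reward.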