Agent on Tau2 Retail (English)
[Chart: Score over time. Top score: 71.6, HyperCLOVA X 32B Think, Jan 3, 2026]
Updated 1mo ago
Evaluation Results
Model                    Method                   Links     Score
HyperCLOVA X 32B Think   User-simulator=GPT-4.1   2026.01   71.6
EXAONE 4.0 32B           User-simulator=GPT-4.1   2026.01   59.5
Qwen3-VL 32B-Thinking    User-simulator=GPT-4.1   2026.01   57