Agent on Terminal Bench Hard English
Top score: 9.9 (HyperCLOVA X 32B Think), Jan 3, 2026
Evaluation Results
| Method | Details | Links | Score |
|---|---|---|---|
| HyperCLOVA X 32B Think | User-simulator=GPT-4.1 | 2026.01 | 9.9 |
| Qwen3-VL 32B-Thinking | User-simulator=GPT-4.1 | 2026.01 | 6.9 |
| EXAONE 4.0 32B | User-simulator=GPT-4.1 | 2026.01 | 3.6 |