HyperCLOVA X 32B Think
About
In this report, we present HyperCLOVA X 32B Think, a vision-language model designed with particular emphasis on reasoning within the Korean linguistic and cultural context, as well as agentic ability. HyperCLOVA X 32B Think is pre-trained with a strong focus on reasoning capabilities and subsequently post-trained to support multimodal understanding, enhanced reasoning, agentic behaviors, and alignment with human preferences. Experimental evaluations against comparably sized models demonstrate that our model achieves strong performance on Korean text-to-text and vision-to-text benchmarks, as well as on agent-oriented evaluation tasks. By open-sourcing HyperCLOVA X 32B Think, we aim to support broader adoption and facilitate further research and innovation across both academic and industrial communities.
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Visual Question Answering | TextVQA | -- | -- | 1117 |
| Agentic Tasks | Tau2-Telecom | Accuracy | 92.3 | 13 |
| Agentic Tasks | Tau2-Airline | Score | 58 | 10 |
| Agentic Tasks | Tau2 Retail | Score | 71.6 | 10 |
| Text-to-Text | KoBALT Korean | Score | 50.6 | 4 |
| Text-to-Text | CLIcK Korean | Score | 75.2 | 4 |
| Text-to-Text | HAERAE Bench Korean 1.0 | Score | 87.4 | 4 |
| Text-to-Text | Flores+ En to Ko | Score | 31.8 | 4 |
| Text-to-Text | KMMLU Korean | Score | 71.3 | 4 |
| Text-to-Text | MMLU English | Score | 87.7 | 4 |