Mi:dm K 2.5 Pro
About
The evolving LLM landscape requires capabilities beyond simple text generation, prioritizing multi-step reasoning, long-context understanding, and agentic workflows. This shift challenges existing models in enterprise environments, especially in Korean-language and domain-specific scenarios where scaling alone is insufficient. We introduce Mi:dm K 2.5 Pro, a 32B-parameter flagship LLM designed to address enterprise-grade complexity through reasoning-focused optimization. Our methodology builds a robust data foundation via a quality-centric curation pipeline that uses abstract syntax tree (AST) analysis for code, gap-filling synthesis for mathematics, and an LLM-based quality evaluator. Pre-training scales the model via layer-predictor-based Depth Upscaling (DuS) and a progressive strategy supporting a 128K-token context window. Post-training introduces a specialized multi-stage pipeline, including Reasoning SFT, model merging, and asynchronous reinforcement learning (RL), to develop complex problem-solving skills. "Fusion Training" then rebalances these capabilities with conversational fluency, consistent response styling, and reliable tool use. Evaluations show that Mi:dm K 2.5 Pro achieves competitive performance against leading global and domestic models and sets state-of-the-art results on Korean-specific benchmarks, demonstrating deep linguistic and cultural understanding. Finally, Responsible AI evaluations validate safety against attacks, confirming a deployment-ready profile that balances harmlessness with responsiveness.
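The AST-based code curation step is described only at a high level above. As a rough illustration of the general idea (not KT's actual pipeline), a quality filter might parse each candidate sample and discard snippets that fail to parse or contain no function or class structure; the function name and threshold below are hypothetical.

```python
import ast

def is_plausible_code_sample(source: str, min_defs: int = 1) -> bool:
    """Hypothetical AST-based filter: keep only Python snippets that parse
    cleanly and contain at least `min_defs` function or class definitions.
    Illustrative sketch only, not the curation logic used for Mi:dm K 2.5 Pro."""
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return False  # discard samples that do not parse at all
    defs = [node for node in ast.walk(tree)
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))]
    return len(defs) >= min_defs

# A well-formed snippet passes; a syntactically broken one is filtered out.
print(is_plausible_code_sample("def add(a, b):\n    return a + b"))  # True
print(is_plausible_code_sample("def add(a, b) return a + b"))        # False
```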
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Instruction Following | IFEval | -- | -- | 625 |
| Coding | HumanEval+ | Pass@1 | 92.07 | 83 |
| Coding | MBPP+ | Pass@1 | 89.68 | 52 |
| General Knowledge | MMLU-Pro | EM | 81.8 | 22 |
| Coding | LiveCodeBench v6 | Pass@1 | 74.79 | 20 |
| Mathematics | AIME25 | Exact Match | 70 | 18 |
| Trustworthiness Evaluation | LLM Trustworthiness Benchmark | Bias Score | 89.58 | 17 |
| Bias Evaluation | KoBBQ | Ambiguous Context Score | 94.56 | 17 |
| Instruction Following | Ko-IFEval | Overall Score | 85.6 | 13 |
| Language Comprehension | Korean Comprehension 1.0 (test) | Ko-Sov (EM) | 73.5 | 9 |
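The coding rows above report Pass@1, the metric popularized by the HumanEval evaluation protocol. For reference, the sketch below shows the standard unbiased pass@k estimator (Chen et al., 2021); with k = 1 it reduces to the fraction of generations that pass the unit tests. The exact sampling setup used for these scores is not stated here, so this is context for reading the numbers rather than a description of the reported runs.

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: probability that at least one of k samples,
    drawn from n generations of which c pass the tests, is correct."""
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# With k = 1 this is simply c / n, i.e. the share of correct generations.
print(pass_at_k(n=10, c=9, k=1))  # 0.9
```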