# K2-V2: A 360-Open, Reasoning-Enhanced LLM

## About
We introduce K2-V2, a 360-open LLM built from scratch as a strong base for reasoning adaptation, alongside the conversational and knowledge-retrieval capabilities expected of general LLMs. It stands as the strongest fully open model in its size class: it rivals open-weight leaders, outperforms Qwen2.5-72B, and approaches the performance of Qwen3-235B. We actively infuse domain knowledge, reasoning, long-context, and tool-use capabilities throughout the training process, explicitly preparing the model for complex reasoning tasks. We demonstrate this potential using simple supervised fine-tuning, establishing a strong baseline that leaves significant headroom for advanced alignment. By releasing the full training history and data composition, we maximize the effectiveness of continuous training, a key open-source production scenario. We release the model weights together with the signature LLM360 artifacts, including the complete training data, to empower the community with a capable, reasoning-centric foundation.
## Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Instruction Following | IFEval | -- | -- | 292 |
| Code Generation | HumanEval+ | -- | -- | 189 |
| Instruction Following | IFBench | Pass@1 (Strict) | 46.3 | 68 |
| General Reasoning | BIG-Bench Hard | Accuracy | 87.6 | 68 |
| Long-context Understanding | RULER | Performance @ 4K Context | 94.3 | 65 |
| Question Answering | PopQA | Score | 32.2 | 50 |
| Mathematical Reasoning | MATH | Score | 94.5 | 50 |
| Code Generation | MBPP+ | Score | 66 | 43 |
| Logical Reasoning | ZebraLogic | Score | 79.2 | 42 |
| Mathematics | Base Aggregate Math (test) | Score | 43.3 | 32 |
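Code benchmarks such as HumanEval+ and IFBench report Pass@k-style metrics (the table above lists Pass@1 Strict for IFBench). As a point of reference, here is a minimal sketch of the standard unbiased pass@k estimator commonly used for such benchmarks; the function name and sampling setup (`n` generations per problem, `c` of them correct) are illustrative, not taken from the K2-V2 evaluation code:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator.

    Probability that at least one of k samples, drawn without replacement
    from n generations of which c are correct, passes the tests:
    pass@k = 1 - C(n - c, k) / C(n, k).
    """
    if n - c < k:
        # Fewer than k incorrect samples exist, so any k-subset
        # must contain at least one correct sample.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# Example: 10 generations, 5 correct, single sample drawn.
print(pass_at_k(10, 5, 1))  # 0.5
```

Benchmark scores like Pass@1 are then the mean of this estimate over all problems in the suite.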