Every Activation Boosted: Scaling General Reasoner to 1 Trillion Open Language Foundation
About
We introduce Ling 2.0, a series of reasoning-oriented language foundation models built on the principle that every activation boosts reasoning capability. Designed to scale from tens of billions to one trillion parameters under a unified Mixture-of-Experts (MoE) paradigm, Ling 2.0 emphasizes high sparsity, cross-scale consistency, and efficiency guided by empirical scaling laws. The series includes three non-thinking (instruct) models, Ling-mini-2.0, Ling-flash-2.0, and Ling-1T, which range from 16B to 1T total parameters and achieve up to 7-fold active-compute efficiency compared with dense counterparts. Ling 2.0 integrates coordinated innovations across model architecture, pre-training, post-training, and infrastructure: a high-sparsity MoE with multi-token prediction (MTP) for efficient reasoning, reasoning-oriented data and mid-training chain-of-thought (CoT) activation, reinforcement-based fine-tuning (DFT, Evo-CoT), and full-scale FP8 training with fine-grained heterogeneous pipelines. At the trillion scale, Ling-1T establishes a new Pareto frontier of reasoning accuracy versus computational efficiency, demonstrating that sparse activation, when properly aligned with reasoning objectives, enables scalable and efficient intelligence. Collectively, Ling 2.0 provides a coherent, open, and efficient foundation for advancing future reasoning and thinking models, including the Ring series built upon the same base.
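To make "sparse activation" concrete, the following is a minimal sketch of generic top-k MoE routing: a router scores every expert, only the k highest-scoring experts run, and their outputs are mixed by renormalized router weights. It is not Ling 2.0's actual router. The expert count, the value of k, the toy scalar experts, and the function names `route_top_k` and `moe_forward` are all hypothetical choices made for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are renormalized so that the selected experts sum to 1.
    """
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_forward(x, experts, router_logits, k):
    """Run only the selected experts and mix their outputs by router weight."""
    selected = route_top_k(router_logits, k)
    output = sum(w * experts[i](x) for i, w in selected)
    return output, [i for i, _ in selected]


# Hypothetical sizes, chosen for illustration only (not Ling 2.0's configuration).
NUM_EXPERTS = 8
TOP_K = 2

# Toy "experts": each one scales its scalar input by a different factor.
experts = [lambda x, scale=i + 1: scale * x for i in range(NUM_EXPERTS)]

rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]

y, used = moe_forward(1.5, experts, logits, TOP_K)
activation_ratio = len(used) / NUM_EXPERTS  # fraction of experts that ran
print(f"experts used: {used}, activation ratio: {activation_ratio:.2f}")
```

In this toy setting only 2 of 8 experts execute per input, so expert compute per token is a quarter of what running every expert would cost, while all 8 experts still contribute to total model capacity.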
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Logical reasoning | BBH | Accuracy: 83.25 | 93 |
| General Reasoning | BIG-Bench Hard | -- | 68 |
| Code Generation | MBPP | MBPP Accuracy: 84.07 | 22 |
| Multitask Language Understanding | MMLU | MMLU Score: 78.5 | 7 |
| Mathematical Reasoning | gsm | GSM Accuracy: 90.75 | 7 |
| Mathematical Reasoning | GSM8K | GSM Score: 90.75 | 7 |
| Coding | Composite (CRUXEval-O, MBPP, MBPP+, MultiPL-E, HumanEval, HumanEval+, HumanEvalFix, HumanEval-cn, BigCodeBench-Full, LiveCodeBench, Aider, BIRD-SQL, Spider) | CRUXEval-O Score: 76.12 | 4 |
| Knowledge Evaluation | Composite (MMLU, MMLU-Pro, CMMLU, C-EVAL, GAOKAO-Bench, ARC-c, GPQA, SciBench, PHYBench, TriviaQA) | Overall Average Score: 65.77 | 4 |
| Knowledge Retrieval & Understanding | Knowledge Suite | MMLU: 87.98 | 4 |
| Logical and Commonsense Reasoning | Reasoning Suite | BIG-Bench Hard: 89.36 | 4 |