Every Activation Boosted: Scaling General Reasoner to 1 Trillion Open Language Foundation
About
We introduce Ling 2.0, a series of reasoning-oriented language foundation models built upon the principle that every activation boosts reasoning capability. Designed to scale from tens of billions to one trillion parameters under a unified Mixture-of-Experts (MoE) paradigm, Ling 2.0 emphasizes high sparsity, cross-scale consistency, and efficiency guided by empirical scaling laws. The series includes three non-thinking (instruct) models - Ling-mini-2.0, Ling-flash-2.0, and Ling-1T - ranging from 16B to 1T total parameters and achieving up to 7-fold active-compute efficiency compared with dense counterparts. Ling 2.0 integrates coordinated innovations across model architecture, pre-training, post-training, and infrastructure: a high-sparsity MoE with MTP for efficient reasoning, reasoning-oriented data and mid-training CoT activation, reinforcement-based fine-tuning (DFT, Evo-CoT), and full-scale FP8 training with fine-grained heterogeneous pipelines. At the trillion scale, Ling-1T establishes a new Pareto frontier of reasoning accuracy versus computational efficiency, demonstrating that sparse activation, when properly aligned with reasoning objectives, enables scalable and efficient intelligence. Collectively, Ling 2.0 provides a coherent, open, and efficient foundation for advancing future reasoning and thinking models, including the Ring series built upon the same base.
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Commonsense Reasoning | WinoGrande | -- | 1085 |
| General Knowledge | MMLU | MMLU General Knowledge Accuracy: 81 | 234 |
| Logical Reasoning | BBH | Accuracy: 83.25 | 201 |
| Commonsense Reasoning | ARC-C | -- | 172 |
| Commonsense Inference | HellaSwag | Accuracy: 84.69 | 91 |
| Code | HumanEval | HumanEval Accuracy: 70.1 | 79 |
| General Reasoning | BIG-Bench Hard | -- | 68 |
| Math | GSM8K | Pass@1: 90.75 | 47 |
| Long-context Understanding | RULER 64k | Accuracy: 72.12 | 25 |
| Code Generation | MBPP | MBPP Accuracy: 84.07 | 22 |