MiniCPM4: Ultra-Efficient LLMs on End Devices
About
This paper introduces MiniCPM4, a highly efficient large language model (LLM) designed explicitly for end-side devices. We achieve this efficiency through systematic innovation in four key dimensions: model architecture, training data, training algorithms, and inference systems. In terms of model architecture, we propose InfLLM v2, a trainable sparse attention mechanism that accelerates both the prefilling and decoding phases of long-context processing. Regarding training data, we propose UltraClean, an efficient and accurate pre-training data filtering and generation strategy, and UltraChat v2, a comprehensive supervised fine-tuning dataset. Together, these datasets enable satisfactory model performance with just 8 trillion training tokens. Regarding training algorithms, we propose ModelTunnel v2 for efficient pre-training strategy search, and we improve existing post-training methods by introducing chunk-wise rollout for load-balanced reinforcement learning and BitCPM, a data-efficient ternary LLM. Regarding inference systems, we propose CPM.cu, which integrates sparse attention, model quantization, and speculative sampling for efficient prefilling and decoding. To meet diverse on-device requirements, MiniCPM4 is available in two versions, with 0.5B and 8B parameters, respectively. Furthermore, we construct a hybrid reasoning model, MiniCPM4.1, which can operate in both deep-reasoning and non-reasoning modes. Evaluation results demonstrate that MiniCPM4 and MiniCPM4.1 outperform similar-sized open-source models across benchmarks, with the 8B variants showing significant speed improvements on long-sequence understanding and generation.
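To illustrate the general idea behind a trainable sparse attention mechanism like InfLLM v2, the sketch below shows block-sparse attention: each query attends only to the top-k most relevant key blocks rather than the full context. All function names and the block-selection heuristic (mean-pooled key summaries) are illustrative assumptions for exposition, not the actual MiniCPM4 implementation.

```python
# Minimal sketch of block-sparse attention. This is NOT the InfLLM v2
# algorithm; it only demonstrates the core idea of restricting attention
# to a few selected key/value blocks to cut long-context cost.
import numpy as np

def block_sparse_attention(q, k, v, block_size=4, top_k=2):
    """Attend to only the top-k key blocks most relevant to query q.

    q: (d,) query vector; k, v: (n, d) key/value matrices.
    """
    n, d = k.shape
    n_blocks = n // block_size
    # Cheap per-block summary: mean-pool each key block.
    k_blocks = k[: n_blocks * block_size].reshape(n_blocks, block_size, d)
    summaries = k_blocks.mean(axis=1)                  # (n_blocks, d)
    # Pick the top-k blocks by query-summary similarity.
    scores = summaries @ q                             # (n_blocks,)
    chosen = sorted(np.argsort(scores)[-top_k:])
    # Dense attention over the selected blocks only.
    idx = np.concatenate([np.arange(b * block_size, (b + 1) * block_size)
                          for b in chosen])
    logits = (k[idx] @ q) / np.sqrt(d)
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    return weights @ v[idx]                            # (d,)

rng = np.random.default_rng(0)
q = rng.normal(size=8)
k = rng.normal(size=(32, 8))
v = rng.normal(size=(32, 8))
out = block_sparse_attention(q, k, v)
print(out.shape)  # (8,)
```

Per-query cost here scales with `top_k * block_size` instead of sequence length, which is why such schemes speed up both prefilling and decoding on long inputs.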
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Reasoning | BBH | -- | 507 |
| Mathematical Reasoning | AIME 25 | Accuracy: 72.08 | 201 |
| Knowledge | MMLU-Pro | Score: 72.7 | 30 |
| Coding | HumanEval | Mean Score: 0.9146 | 28 |
| Math | AIME24 | -- | 20 |
| Coding | LiveCodeBench v5 | Accuracy: 56.89 | 18 |
| Coding | MBPP | Score: 91.05 | 11 |
| Knowledge | CMMLU | Score: 84.72 | 6 |
| Overall | Standard Evaluation Suite | Average Score: 0.7613 | 6 |
| Other | IFEval | Score: 77.45 | 6 |