
Layer-adaptive Expert Pruning for Pre-Training of Mixture-of-Experts Large Language Models

About

Although Mixture-of-Experts (MoE) Large Language Models (LLMs) deliver superior accuracy with a reduced number of active parameters, their pre-training remains a significant computational bottleneck due to underutilized experts and limited training efficiency. This work introduces a Layer-Adaptive Expert Pruning (LAEP) algorithm designed for the pre-training stage of MoE LLMs. In contrast to previous expert pruning approaches, which operate primarily in the post-training phase, the proposed algorithm improves training efficiency by selectively pruning underutilized experts and reorganizing the remaining experts across computing devices according to token distribution statistics. Comprehensive experiments demonstrate that LAEP effectively reduces model size and substantially improves pre-training efficiency. In particular, when pre-training the Yuan3.0-1T Base model (originally 1515B parameters) from scratch, LAEP achieves a 48.3% improvement in training efficiency alongside a 33.3% parameter reduction, while still delivering strong performance across multiple domains.
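The abstract does not spell out implementation details, but the two core steps it names, pruning experts whose token traffic is low relative to their layer and repacking the survivors across devices, can be illustrated in a short sketch. The NumPy code below is a minimal toy version under stated assumptions: the array shapes, the thresholds (`rel_threshold`, `keep_ratio_floor`), and the greedy device-packing heuristic are all illustrative choices, not the paper's actual algorithm or API.

```python
# Toy sketch of layer-adaptive expert pruning driven by token routing
# statistics. Every name and threshold here is an assumption for
# illustration, not the LAEP algorithm as published.
import numpy as np

rng = np.random.default_rng(0)

NUM_LAYERS = 4
EXPERTS_PER_LAYER = 8
NUM_DEVICES = 2

# Assumed input: per-layer counts of tokens routed to each expert,
# accumulated over a window of training steps.
token_counts = rng.poisson(
    lam=1000, size=(NUM_LAYERS, EXPERTS_PER_LAYER)
).astype(float)
token_counts[:, -2:] *= 0.05  # make two experts clearly underutilized


def layer_adaptive_prune(counts, keep_ratio_floor=0.5, rel_threshold=0.2):
    """Return a boolean keep-mask per layer.

    An expert is pruned when its token share falls below `rel_threshold`
    times the layer's mean share; at least `keep_ratio_floor` of each
    layer's experts are always kept. Both knobs are assumptions.
    """
    num_experts = counts.shape[1]
    shares = counts / counts.sum(axis=1, keepdims=True)
    keep = shares >= rel_threshold * shares.mean(axis=1, keepdims=True)
    min_keep = int(np.ceil(keep_ratio_floor * num_experts))
    for layer in range(counts.shape[0]):
        if keep[layer].sum() < min_keep:
            # Fall back to keeping the top experts by token share.
            top = np.argsort(shares[layer])[::-1][:min_keep]
            keep[layer] = False
            keep[layer, top] = True
    return keep


def rebalance(counts, keep, num_devices):
    """Greedily place surviving experts so expected token load per
    device is roughly even (longest-processing-time heuristic)."""
    placement = []
    for layer in range(counts.shape[0]):
        loads = np.zeros(num_devices)
        assign = {}
        experts = np.flatnonzero(keep[layer])
        # Heaviest experts first, each to the currently lightest device.
        for e in experts[np.argsort(counts[layer, experts])[::-1]]:
            d = int(loads.argmin())
            assign[int(e)] = d
            loads[d] += counts[layer, e]
        placement.append(assign)
    return placement


keep = layer_adaptive_prune(token_counts)
placement = rebalance(token_counts, keep, NUM_DEVICES)
for layer, assign in enumerate(placement):
    print(f"layer {layer}: kept experts {sorted(assign)} -> devices {assign}")
```

Running the sketch prunes the two low-traffic experts in every layer and spreads the remaining six across the two devices by load; a layer-adaptive scheme differs from a global one in that the threshold is computed per layer, so layers with flatter routing distributions keep more experts.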

YuanLab.ai: Shawn Wu, Jiangang Luo, Tong Yu, Darcy Chen, Sean Wang, Xudong Zhao, Louie Li, Claire Wang, Hunter He, Carol Wang, Allen Wang • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Language Understanding | MMLU | Accuracy | 78 | 756 |
| Math | GSM8K | Accuracy | 0.861 | 87 |
| Code | HumanEval | Accuracy | 70.7 | 50 |
| Mathematics | MATH | Accuracy | 66.1 | 32 |
| Coding | MBPP | Accuracy | 75.9 | 31 |
| Natural Language Understanding | ARC Challenge | Accuracy | 94.3 | 14 |
| Training Efficiency | Yuan3.0-1T Pre-training Base (train) | TFLOPS | 92.6 | 6 |
| Language | Pile (test) | Accuracy | 59.4 | 3 |
| Language | NaturalQuestions | Accuracy | 0.433 | 3 |
