
Mixture of Universal Experts: Scaling Virtual Width via Depth-Width Transformation

About

Mixture-of-Experts (MoE) decouples model capacity from per-token computation, yet its scalability remains limited by the physical dimensions of depth and width. To overcome this, we propose Mixture of Universal Experts (MoUE), an MoE generalization that introduces a novel scaling dimension: virtual width. MoUE reuses a universal, layer-agnostic expert pool across layers, converting depth into virtual width under a fixed per-token activation budget. Two challenges arise, however: a routing-path explosion from recursive expert reuse, and a mismatch between the exposure induced by reuse and conventional load-balancing objectives. We address these with three core components: a Staggered Rotational Topology for structured expert sharing, a Universal Expert Load Balance for depth-aware exposure correction, and a Universal Router with a lightweight trajectory state for coherent multi-step routing. Empirically, MoUE consistently outperforms matched MoE baselines by up to 1.3% across scaling regimes, enables progressive conversion of existing MoE checkpoints with gains of up to 4.2%, and reveals a new scaling dimension for MoE architectures.
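The core idea — one layer-agnostic expert pool reused at every depth, with a rotated routing view per layer and a fixed top-k activation budget — can be sketched as follows. This is an illustrative toy only: the abstract does not give the paper's actual equations, so the expert shapes, the single shared router, and the per-layer rotation rule here are all assumptions.

```python
# Toy sketch of MoUE-style expert reuse (all shapes and the rotation rule
# are assumptions; the paper's exact formulation is not given in the abstract).
import numpy as np

rng = np.random.default_rng(0)

D = 8       # hidden size
E = 6       # size of the universal, layer-agnostic expert pool
L = 4       # depth: every layer reuses the SAME pool (depth -> virtual width)
TOP_K = 2   # fixed per-token activation budget

# One shared pool of expert weights and one shared ("universal") router,
# reused by every layer instead of per-layer parameters.
expert_w = rng.standard_normal((E, D, D)) * 0.1
router_w = rng.standard_normal((D, E)) * 0.1

def staggered_view(layer: int) -> np.ndarray:
    """Assumed form of the Staggered Rotational Topology: layer l sees the
    pool rotated by l, so consecutive layers favor different expert subsets."""
    return np.roll(np.arange(E), layer)

def moue_forward(x: np.ndarray) -> np.ndarray:
    for layer in range(L):
        order = staggered_view(layer)
        logits = (x @ router_w)[order]           # shared router, rotated view
        top = np.argsort(logits)[-TOP_K:]        # budget stays TOP_K per layer
        gates = np.exp(logits[top]) / np.exp(logits[top]).sum()
        out = sum(g * (expert_w[order[i]] @ x) for g, i in zip(gates, top))
        x = x + out                              # residual update
    return x

y = moue_forward(rng.standard_normal(D))
print(y.shape)  # (8,)
```

With E pool experts visible through L rotated layer views, a token's routing trajectory spans up to E×L (expert, layer) slots while activating only TOP_K experts per layer — the "virtual width" the abstract refers to, at unchanged per-token cost.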

Yilong Chen, Naibin Gu, Junyuan Shang, Zhenyu Zhang, Yuchen Feng, Jiawei Sheng, Tingwen Liu, Shuohuan Wang, Yu Sun, Hua Wu, Haifeng Wang• 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Commonsense Reasoning | HellaSwag | Accuracy | 80.3 | 1891 |
| Commonsense Reasoning | WinoGrande | Accuracy | 57.9 | 1085 |
| Code Generation | HumanEval | -- | -- | 1036 |
| Question Answering | ARC Challenge | Accuracy | 66.8 | 906 |
| Question Answering | ARC Easy | Accuracy | 79.9 | 597 |
| Knowledge | MMLU | Accuracy | 50.4 | 136 |
| Question Answering | TriviaQA | Accuracy | 51.2 | 112 |
| Question Answering | Natural Questions (NQ) | Accuracy | 21.4 | 48 |
| General Language Understanding | NLP Evaluation Suite (SciQ, PIQA, WG, ARC, HellaSwag, LogiQA, BoolQ, LAMBADA) | SciQ Accuracy | 58.3 | 14 |
