
MoNE: Replacing Redundant Experts with Lightweight Novices for Structured Pruning of MoE

About

Mixture-of-Experts (MoE) enables efficient scaling of large language models by activating only a subset of experts per input token. However, deploying MoE-based models incurs significant memory overhead because all experts must be retained in memory. While structured pruning is a promising way to reduce memory costs, existing methods often show suboptimal performance and unstable degradation along three dimensions: model architecture, calibration data source, and calibration sample size. This paper proposes Mixture-of-Novices-and-Experts (MoNE), a novel expert pruning method that replaces redundant experts with lightweight novices to achieve effective and robust model compression. MoNE evaluates expert redundancy with two metrics: access frequency and output variance. Experts with low usage and stable outputs are pruned and replaced with lightweight novices (unbiased estimates of their original outputs), minimizing performance degradation. Extensive experiments demonstrate that MoNE consistently outperforms baseline methods with minimal accuracy degradation across all three dimensions, confirming its effectiveness and robustness. Notably, it outperforms baselines by up to 2.72 points in average zero-shot accuracy across nine downstream tasks at a 25% pruning ratio, with only a 0.14-point drop for Qwen2-57B-A14B. The code is available at https://github.com/zxgx/mode-pd.
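The abstract's two redundancy signals (access frequency and output variance) and the novice replacement (an unbiased estimate of the pruned expert's output) can be sketched as follows. This is an illustrative sketch only, not the paper's implementation: the names `score_experts` and `prune_to_novices` are hypothetical, and the combined scoring rule (frequency times variance) is an assumption; consult the linked repository for the actual method.

```python
import numpy as np

def score_experts(assignments, outputs, num_experts):
    """Score experts on calibration data.
    assignments: (T,) expert index chosen for each calibration token.
    outputs:     (T, d) the chosen expert's output for each token.
    Returns per-expert (access_frequency, output_variance)."""
    freq = np.zeros(num_experts)
    var = np.zeros(num_experts)
    for e in range(num_experts):
        mask = assignments == e
        freq[e] = mask.mean()                      # fraction of tokens routed to e
        if mask.sum() > 1:
            var[e] = outputs[mask].var(axis=0).mean()  # mean per-dim variance
    return freq, var

def prune_to_novices(freq, var, outputs, assignments, ratio):
    """Replace the most redundant experts (low frequency, low variance)
    with 'novices': constant vectors equal to the expert's mean
    calibration output, an unbiased estimate of that output.
    The product freq * var as a redundancy score is an assumption."""
    num_experts = len(freq)
    score = freq * var                   # lower => more redundant
    n_prune = int(ratio * num_experts)
    pruned = np.argsort(score)[:n_prune]
    novices = {}
    for e in pruned:
        mask = assignments == e
        novices[int(e)] = (outputs[mask].mean(axis=0)
                           if mask.any() else np.zeros(outputs.shape[1]))
    return novices
```

A novice stores only a single d-dimensional vector per pruned expert, which is why the memory saving tracks the pruning ratio while low-variance experts lose little accuracy.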

Geng Zhang, Yuxuan Han, Yuxuan Lou, Yiqi Zhang, Wangbo Zhao, Yang You • 2025

Related benchmarks

Task                             | Dataset       | Metric   | Result | Rank
Question Answering               | ARC Challenge | Accuracy | 56.14  | 749
Question Answering               | OpenBookQA    | Accuracy | 46.8   | 465
Question Answering               | ARC Easy      | Accuracy | 80.6   | 386
Natural Language Inference       | RTE           | Accuracy | 77.98  | 367
Boolean Question Answering       | BoolQ         | Accuracy | 85.41  | 307
Question Answering               | BoolQ         | Accuracy | 89.11  | 240
Commonsense Reasoning            | WinoGrande    | Accuracy | 70.17  | 231
Multitask Language Understanding | MMLU          | Accuracy | 74.04  | 206
Commonsense Reasoning            | COPA          | Accuracy | 94     | 138
Question Answering               | OpenBookQA    | Accuracy | 42     | 84
(Showing 10 of 16 rows.)
