
ROMER: Expert Replacement and Router Calibration for Robust MoE LLMs on Analog Compute-in-Memory Systems

About

Large language models (LLMs) with mixture-of-experts (MoE) architectures achieve remarkable scalability by sparsely activating a subset of experts per token, yet their frequent expert switching creates memory bandwidth bottlenecks that compute-in-memory (CIM) architectures are well suited to mitigate. However, analog CIM systems suffer from inherent hardware imperfections that perturb stored weights, and the impact of these imperfections on MoE-based LLMs in noisy CIM environments remains unexplored. In this work, we present the first systematic investigation of MoE-based LLMs under a noise model calibrated with real chip measurements, revealing that hardware noise critically disrupts expert load balance and renders clean-trained routing decisions consistently suboptimal. Based on these findings, we propose ROMER, a post-training calibration framework that (1) replaces underactivated experts with high-frequency ones to restore load balance, and (2) recalibrates router logits via percentile-based normalization to stabilize routing under noise. Extensive experiments across multiple benchmarks demonstrate that ROMER achieves up to 58.6%, 58.8%, and 59.8% reduction in perplexity under real-chip noise conditions for DeepSeek-MoE, Qwen-MoE, and OLMoE, respectively, establishing its effectiveness and generalizability across diverse MoE architectures.
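The two calibration steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how experts are paired for replacement or which percentiles are used, so the function names (`replacement_map`, `calibrate_router`, `normalize`), the coldest-to-hottest pairing, the 5th/95th percentile defaults, and the choice to normalize each expert's logit over a calibration batch are all assumptions made for illustration.

```python
def top_k(logits, k):
    """Indices of the k largest router logits for one token."""
    return sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]


def activation_counts(token_logits, k):
    """Count how often each expert is selected over a calibration batch."""
    counts = [0] * len(token_logits[0])
    for logits in token_logits:
        for expert in top_k(logits, k):
            counts[expert] += 1
    return counts


def replacement_map(counts, num_replace):
    """Map the least-activated experts onto the most-activated ones.

    Assumption: the coldest expert is paired with the hottest one, the
    second coldest with the second hottest, and so on.
    """
    if 2 * num_replace > len(counts):
        raise ValueError("cold and hot expert sets would overlap")
    order = sorted(range(len(counts)), key=lambda e: counts[e])
    cold = order[:num_replace]
    hot = order[::-1][:num_replace]
    mapping = {e: e for e in range(len(counts))}
    mapping.update(zip(cold, hot))
    return mapping


def percentile(values, q):
    """Linearly interpolated percentile, with q in [0, 100]."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def calibrate_router(token_logits, lo_q=5.0, hi_q=95.0):
    """Per-expert (offset, scale) taken from percentiles of calibration logits."""
    stats = []
    for column in zip(*token_logits):  # one column per expert
        lo, hi = percentile(column, lo_q), percentile(column, hi_q)
        stats.append((lo, (hi - lo) or 1.0))  # guard against zero range
    return stats


def normalize(logits, stats):
    """Rescale one token's router logits with the calibrated statistics."""
    return [(x - lo) / scale for x, (lo, scale) in zip(logits, stats)]
```

In this sketch, `activation_counts` and `calibrate_router` would be run once on router logits collected under the noisy weights, after which `replacement_map` redirects rarely used expert slots and `normalize` is applied to each token's logits before top-k selection.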

Wenyong Zhou, Yuannuo Feng, Yizhe Chen, Taiqiang Wu, Wendong Xu, Wenbo Qi, Zhengwu Liu, Wang Kang, Ngai Wong • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Language Modeling | WikiText-2 | Perplexity (PPL) | 20.73 | 2320 |
| Commonsense Reasoning | ARC Challenge | Accuracy | 45 | 243 |
| Commonsense Reasoning | PIQA | Accuracy | 78.6 | 213 |
| Language Modeling | LAMBADA | Perplexity | 12.2 | 198 |
| Common Sense Reasoning | ARC Easy | ARC (easy) Accuracy | 69.4 | 101 |
| Language Modeling | WikiText-103 v1 (test) | Perplexity | 10.82 | 56 |