
Resting Neurons, Active Insights: Robustify Activation Sparsity for Large Language Models

About

Activation sparsity offers a compelling route to accelerating large language model (LLM) inference by selectively suppressing hidden activations, yet existing approaches exhibit severe accuracy degradation at high sparsity. We show that this failure stems from representational instability: *activation sparsity disrupts the input-dependent activation patterns learned during pretraining, inducing distribution shifts in hidden states.* We address this issue by reframing activation sparsity as a representational alignment problem and introducing **Spontaneous Neurons (SPON)**, a lightweight mechanism inspired by spontaneous neural activity in biological systems. SPON injects a small set of learnable, input-independent activation vectors that act as persistent representational anchors for sparse computation. These vectors are trained via distribution matching to the dense model and can be absorbed into bias terms after training, incurring negligible inference overhead. Across multiple LLM backbones, SPON consistently restores performance, stabilizes latent representations, and preserves generalization. Our results establish SPON as an effective and principled solution for reliable activation-sparse inference, and offer new insights into knowledge retention in LLMs.
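The abstract describes three ingredients: a learnable input-independent vector added to sparsified hidden activations, a distribution-matching objective against the dense model, and folding the vector into a bias term after training. The sketch below illustrates this idea under stated assumptions; the module name `SponMLP`, the top-k sparsification rule, and the MSE matching loss are illustrative choices, not the paper's exact implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SponMLP(nn.Module):
    """Toy MLP block with a SPON-style spontaneous-neuron anchor (hypothetical sketch)."""

    def __init__(self, d_model=16, d_hidden=64, k=8):
        super().__init__()
        self.up = nn.Linear(d_model, d_hidden)
        self.down = nn.Linear(d_hidden, d_model)
        self.k = k
        # learnable, input-independent activation vector (the representational anchor)
        self.spon = nn.Parameter(torch.zeros(d_hidden))

    def forward(self, x, sparse=True):
        h = torch.relu(self.up(x))
        if sparse:
            # top-k activation sparsity: keep only the k largest activations per token
            idx = torch.topk(h, self.k, dim=-1).indices
            mask = torch.zeros_like(h).scatter_(-1, idx, 1.0)
            h = h * mask + self.spon  # anchor is added regardless of the input
        return self.down(h)

torch.manual_seed(0)
mlp = SponMLP()

# distribution matching: train only `spon` so sparse outputs track the dense model
opt = torch.optim.Adam([mlp.spon], lr=1e-2)
for _ in range(200):
    x = torch.randn(32, 16)
    loss = F.mse_loss(mlp(x, sparse=True), mlp(x, sparse=False).detach())
    opt.zero_grad()
    loss.backward()
    opt.step()

# after training, absorb the anchor into the down-projection bias:
# down(h*mask + spon) = W(h*mask) + (W @ spon + b), so W @ spon folds into b
x = torch.randn(4, 16)
with torch.no_grad():
    y_before = mlp(x, sparse=True)
    mlp.down.bias += mlp.down.weight @ mlp.spon
    mlp.spon.zero_()
    y_after = mlp(x, sparse=True)
print(torch.allclose(y_before, y_after, atol=1e-5))  # folding changes nothing
```

The folding step is why the abstract can claim "negligible inference overhead": once `W @ spon` is merged into the existing bias, the sparse forward pass costs nothing extra.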

Haotian Xu, Jiannan Yang, Tian Gao, Tsui-Wei Weng, Tengfei Ma • 2025

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Medical Question Answering | MedMCQA | Accuracy | 53.41 | 346 |
| Question Answering | CommonsenseQA | Accuracy | 74.28 | 148 |
| Question Answering | TruthfulQA | Accuracy | 57.15 | 73 |
| Language Modeling | WikiText (test) | Perplexity | 5.58 | 62 |
| Question Answering | MMLU | Accuracy | 68.79 | 46 |
| Question Answering | MathQA | Accuracy | 46.7 | 12 |
