Resting Neurons, Active Insights: Robustifying Activation Sparsity in LLMs via Spontaneity

About

Activation sparsity offers a compelling route to accelerate large language model (LLM) inference by selectively suppressing hidden activations, yet existing approaches exhibit severe accuracy degradation at high sparsity. We show that this failure stems from representational instability: *activation sparsity disrupts input-dependent activation learned during pretraining, inducing distribution shifts in hidden states.* We address this issue by reframing activation sparsity as a representational alignment problem and introducing **Spontaneous Neurons (SPON)**, a lightweight mechanism inspired by spontaneous neural activity in biological systems. SPON injects a small set of learnable, input-independent activation vectors that act as persistent representational anchors for sparse computation. These vectors are trained via distribution matching to the dense model and can be absorbed into bias terms after training, incurring negligible inference overhead. Across multiple LLM backbones, SPON consistently restores performance, stabilizes latent representations, and preserves generalization. Our results establish SPON as an effective and principled solution for reliable activation-sparse inference, and offer new insights into knowledge retention in LLMs.

Haotian Xu, Jiannan Yang, Tian Gao, Tsui-Wei Weng, Tengfei Ma• 2025

Related benchmarks

Task	Dataset	Result
Medical Question Answering	MedMCQA	Accuracy53.41	521
Question Answering	CommonsenseQA	Accuracy74.28	150
Question Answering	TruthfulQA	Accuracy57.15	73
Language Modeling	Wikitext (test)	Perplexity5.58	66
Question Answering	MMLU	Accuracy68.79	46
Question Answering	MathQA	Accuracy46.7	36

Showing 6 of 6 rows

Other info

Follow for update

@wizwand_team Discord