Plug-and-Play Spiking Operators: Breaking the Nonlinearity Bottleneck in Spiking Transformers

About

ANN-to-SNN conversion offers a practical, training-free route to spiking large language models. However, current pipelines focus primarily on spike-driven realizations of the linear-algebra operations in Transformers and provide only limited support for key nonlinear operators. This gap limits compatibility with neuromorphic-style execution constraints, because such nonlinearities typically require division, exponentiation, or norm computations that are not naturally supported by standard leaky integrate-and-fire (LIF) dynamics. To address this, we propose a plug-and-play framework that implements spike-friendly approximations of Transformer nonlinearities and integrates into existing ANN-to-SNN pipelines. Our method decomposes these nonlinear computations into three recurring primitives -- division, exponentiation, and $\ell_2$ norms -- and realizes them via population computation using LIF neuron groups, combined with lightweight bit-shift scaling to avoid floating-point arithmetic. By composing these primitives as modular operator blocks, our framework supports common Transformer nonlinearities (e.g., Softmax, SiLU, and normalization) without any fine-tuning. Experiments on a range of Transformer-based LLMs show that selectively replacing the targeted nonlinear operators incurs less than a $1\%$ accuracy drop across all evaluated tasks.
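The decomposition described in the abstract can be illustrated with a minimal Python sketch. It shows only how Softmax, SiLU, and an RMS-style normalization reduce to the three primitives (division, exponentiation, $\ell_2$ norm). The function names `prim_div`, `prim_exp`, and `prim_l2norm` are hypothetical, and the primitives here are ordinary floating-point stand-ins: the LIF population realization and the bit-shift scaling are not modeled, and the choice of RMS normalization as the example of "normalization" is an assumption.

```python
import math

# Hypothetical stand-ins for the three primitives. In the paper these are
# realized with LIF neuron populations; here they are plain float functions.
def prim_div(a, b):
    return a / b

def prim_exp(x):
    return math.exp(x)

def prim_l2norm(xs):
    return math.sqrt(sum(x * x for x in xs))

def softmax(xs):
    # Subtracting the max is a standard stability trick, not taken from the source.
    m = max(xs)
    es = [prim_exp(x - m) for x in xs]      # exponentiation primitive
    total = sum(es)
    return [prim_div(e, total) for e in es]  # division primitive

def silu(x):
    # SiLU(x) = x * sigmoid(x) = x / (1 + exp(-x))
    return prim_div(x, 1.0 + prim_exp(-x))

def rmsnorm(xs, eps=1e-6):
    # RMS(x) = ||x||_2 / sqrt(n); learned scale omitted for brevity.
    rms = prim_div(prim_l2norm(xs), math.sqrt(len(xs)))
    return [prim_div(x, rms + eps) for x in xs]

probs = softmax([1.0, 2.0, 3.0])
normed = rmsnorm([3.0, 4.0])
```

Because each operator calls only the three primitives plus additions, swapping the stand-ins for spike-based versions would leave the operator definitions unchanged, which is the sense in which such blocks are modular.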

Xinzhe Yuan, Xiang Peng, Bin Gu, Huan Xiong ((1) IASM, Harbin Institute of Technology, (2) School of Artificial Intelligence, Jilin University) • 2026

Related benchmarks

Task	Dataset	Result
Language Understanding	WinoGrande	Accuracy 79.08	38
Natural Language Understanding	ARC Easy	Accuracy 72.77	36
Natural Language Understanding	HellaSwag	Accuracy 85.6	35
Natural Language Understanding	PIQA	PIQA Accuracy 83.9	16
Natural Language Understanding	ARC Challenge	Accuracy 62.54	16
