SymbolicLight V1: Spike-Gated Dual-Path Language Modeling with High Activation Sparsity and Sub-Billion-Scale Pre-Training Evidence

About

Natively trained spiking language models struggle to combine Transformer-like language quality, stable multi-domain pre-training, and high activation sparsity. We present SymbolicLight V1, a spike-gated dual-path language model that combines binary Leaky Integrate-and-Fire (LIF) spike dynamics with a continuous residual stream. Its Dual-Path SparseTCAM module replaces dense self-attention with an exponential-decay aggregation path for long-range memory and a spike-gated local attention path for short-range precision, complemented by a dynamic context-conditioned decoding head and a bilingual tokenizer. A 194M-parameter SymbolicLight V1 model trained from scratch on a 3B-token Chinese-English corpus reaches a held-out validation perplexity (PPL) of 8.88-8.93 across four independent runs at >89% per-element activation sparsity. It trails GPT-2 201M by 7.7% in PPL while surpassing GPT-2 124M under the reported comparison. Component ablations at matched 0.5B-token training budgets show that the spike-gated local attention path is the largest contributor, and that replacing LIF dynamics with a deterministic top-k mask at matched sparsity causes a larger degradation, indicating that temporal integration rather than sparsity alone drives performance. We also report a 0.8B-parameter scale-up run trained on 48.8B tokens as evidence of optimization and sparsity preservation, not as a primary quality comparison. Current dense-hardware inference is slower than GPT-2, so neuromorphic deployment is presented as a future sparsity-driven opportunity rather than an achieved hardware speedup.
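To make the contrast between LIF dynamics and a top-k mask concrete, the following is a minimal sketch and not the authors' implementation. It assumes a textbook discrete-time LIF neuron with a leak factor, a fixed threshold, and a hard reset; the names `lif_spikes`, `topk_mask`, `activation_sparsity`, `beta`, and `threshold` are hypothetical and do not come from the paper. It illustrates only that an LIF neuron carries membrane state across time steps, whereas a top-k mask is computed independently at each step, and how a per-element activation sparsity could be measured on binary activations.

```python
def lif_spikes(inputs, beta=0.9, threshold=1.0):
    """Textbook discrete-time LIF (assumed form, not the paper's exact equations).

    inputs: list of time steps, each a list of floats (one value per neuron).
    Returns a list of binary spike vectors with the same shape.
    """
    v = [0.0] * len(inputs[0])          # membrane potential, one per neuron
    spikes = []
    for x_t in inputs:
        s_t = []
        for i, x in enumerate(x_t):
            v[i] = beta * v[i] + x      # leaky integration of the input
            fired = 1 if v[i] >= threshold else 0
            if fired:
                v[i] = 0.0              # hard reset after a spike (assumption)
            s_t.append(fired)
        spikes.append(s_t)
    return spikes


def topk_mask(inputs, k):
    """Stateless baseline: at each step, mark the k largest inputs with 1."""
    masks = []
    for x_t in inputs:
        order = sorted(range(len(x_t)), key=lambda i: x_t[i], reverse=True)
        keep = set(order[:k])
        masks.append([1 if i in keep else 0 for i in range(len(x_t))])
    return masks


def activation_sparsity(binary):
    """Fraction of zero entries over all elements of a binary activation map."""
    total = sum(len(row) for row in binary)
    active = sum(sum(row) for row in binary)
    return 1.0 - active / total


# Neuron 0 receives a constant sub-threshold input; neuron 1 receives nothing.
inputs = [[0.4, 0.0] for _ in range(6)]
spikes = lif_spikes(inputs)
masks = topk_mask(inputs, k=1)

print([s[0] for s in spikes])           # neuron 0 fires only after integrating
print(activation_sparsity(spikes), activation_sparsity(masks))
```

In this toy input, the LIF neuron fires only once enough sub-threshold input has accumulated, while the top-k mask selects the same element at every step regardless of history. The sparsity levels of the two are not matched here; the ablation described above compares them at matched sparsity.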

Ting Liu • 2026

Related benchmarks

Task	Dataset	Metric	Result
Physical Commonsense Reasoning	PIQA (val test)	Accuracy	53.8	8
Language Modeling	10-domain bilingual corpus (val)	Validation Perplexity	8.88	7
Grade-school science	ARC Easy (full test val)	Accuracy	32.1	3
Commonsense Sentence Completion	HellaSwag full (test val)	Accuracy	26.5	3
Long-range word prediction	LAMBADA full (test val)	Accuracy	80.2	3
Science Question Answering	SciQ full (test val)	Accuracy	31.1	3
