SymbolicLight V1: Spike-Gated Dual-Path Language Modeling with High Activation Sparsity and Sub-Billion-Scale Pre-Training Evidence

About

Natively trained spiking language models struggle to combine Transformer-like language quality, stable multi-domain pre-training, and high activation sparsity. We present SymbolicLight V1, a spike-gated dual-path language model that combines binary Leaky Integrate-and-Fire spike dynamics with a continuous residual stream. Its Dual-Path SparseTCAM module replaces dense self-attention with an exponential-decay aggregation path for long-range memory and a spike-gated local attention path for short-range precision, complemented by a dynamic context-conditioned decoding head and a bilingual tokenizer. A 194M-parameter SymbolicLight V1 model trained from scratch on a 3B-token Chinese-English corpus reaches held-out validation PPL 8.88-8.93 across four independent runs at >89% per-element activation sparsity. It trails GPT-2 201M by 7.7% in PPL while surpassing GPT-2 124M under the reported comparison. Component ablations at matched 0.5B-token training budgets show that the spike-gated local attention path is the largest contributor, and that replacing LIF dynamics with a deterministic top-k mask at matched sparsity causes a larger degradation, indicating that temporal integration rather than sparsity alone drives performance. We also report a 0.8B-parameter scale-up run trained on 48.8B tokens as evidence of optimization and sparsity preservation, not as a primary quality comparison. Current dense-hardware inference is slower than GPT-2, so neuromorphic deployment is presented as a future sparsity-driven opportunity rather than an achieved hardware speedup.
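The abstract names binary Leaky Integrate-and-Fire (LIF) spikes, per-element activation sparsity, and an exponential-decay aggregation path, but this page does not give the equations. The following is a minimal sketch of those three ideas in plain Python, not the paper's implementation. The function names (`lif_spikes`, `activation_sparsity`, `decay_aggregate`), the decay and threshold values, the hard reset, and the multiplicative spike gate are all assumptions made for illustration.

```python
import random


def lif_spikes(inputs, decay=0.9, threshold=1.0):
    """Run one LIF neuron per feature over a sequence (illustrative only).

    inputs: list of time steps, each a list of floats (one per feature).
    Returns a list of binary spike vectors with the same shape.
    """
    n_features = len(inputs[0])
    membrane = [0.0] * n_features  # membrane potential per feature
    spikes = []
    for x_t in inputs:
        s_t = []
        for i, x in enumerate(x_t):
            # Leaky integration: old potential decays, new input is added.
            membrane[i] = decay * membrane[i] + x
            fired = 1 if membrane[i] >= threshold else 0
            if fired:
                membrane[i] = 0.0  # hard reset (assumed; soft reset is also common)
            s_t.append(fired)
        spikes.append(s_t)
    return spikes


def activation_sparsity(spikes):
    """Fraction of zero entries over all (time step, feature) elements."""
    total = sum(len(s_t) for s_t in spikes)
    active = sum(sum(s_t) for s_t in spikes)
    return 1.0 - active / total


def decay_aggregate(spikes, values, gamma=0.95):
    """Exponentially decayed running sum of spike-gated values (assumed form)."""
    state = [0.0] * len(values[0])
    out = []
    for s_t, v_t in zip(spikes, values):
        # Older contributions shrink by gamma; only spiking features add new value.
        state = [gamma * m + s * v for m, s, v in zip(state, s_t, v_t)]
        out.append(list(state))
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    seq = [[rng.gauss(0.0, 0.3) for _ in range(8)] for _ in range(64)]
    spikes = lif_spikes(seq)
    print(f"per-element activation sparsity: {activation_sparsity(spikes):.3f}")
```

Because the membrane potential carries over between time steps, whether a feature fires depends on its input history and not only on the current input. A deterministic top-k mask, by contrast, looks at a single time step, which is the distinction the ablation in the abstract draws.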

Ting Liu • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
| Physical Commonsense Reasoning | PIQA (val test) | Accuracy 53.8 | 8 |
| Language Modeling | 10-domain bilingual corpus (val) | Validation Perplexity 8.88 | 7 |
| Grade-school science | ARC Easy (full test val) | Accuracy 32.1 | 3 |
| Commonsense Sentence Completion | HellaSwag full (test val) | Accuracy 26.5 | 3 |
| Long-range word prediction | LAMBADA full (test val) | Accuracy 80.2 | 3 |
| Science Question Answering | SciQ full (test val) | Accuracy 31.1 | 3 |