
COREY: A Prototype Study of Entropy-Guided Operator Fusion with Hadamard Reparameterization for Selective State Space Models

About

State Space Models (SSMs), represented by the Mamba family, provide linear-time sequence modeling and are attractive for long-context inference. Yet practical deployments remain memory-bandwidth limited because selective state updates are often decomposed into fragmented kernels with repeated intermediate tensor materialization. We present COREY, a prototype framework that combines memory-aware operator fusion with Hadamard-based feature reparameterization. Activation entropy, estimated with fixed-width histograms, is used as a runtime scheduling statistic to place fusion boundaries and choose tile sizes. To regularize heavy-tailed activations, we absorb normalized Hadamard transforms into linear projections, preserving functional equivalence while reducing peak-coordinate concentration. In a controlled prototype study over heavy-tailed SSM activations, COREY consistently reduces proxy latency, improves throughput, and lowers DRAM traffic relative to unfused and fixed-depth baselines. Low-bit results are reported only through a hand-crafted stability proxy and are intended as diagnostic evidence rather than checkpoint-level quality claims. Code repository: https://github.com/mabo1215/COREY_Transformer.git.
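The abstract describes two mechanics that are easy to state concretely: estimating activation entropy from fixed-width histograms and absorbing a normalized Hadamard transform into a linear projection without changing the layer's output. The sketch below is not taken from the COREY codebase; the function names, bin count, and tensor sizes are illustrative assumptions, written in NumPy for self-containment.

```python
# Minimal sketch (illustrative assumptions, not the authors' implementation):
# (1) activation entropy from a fixed-width histogram, (2) folding an
# orthonormal Hadamard rotation into a linear projection's weights.
import numpy as np

def activation_entropy(x, num_bins=64):
    """Shannon entropy (bits) of an activation tensor via a fixed-width histogram."""
    counts, _ = np.histogram(x, bins=num_bins)
    p = counts / counts.sum()
    p = p[p > 0]                      # drop empty bins to avoid log(0)
    return float(-(p * np.log2(p)).sum())

def normalized_hadamard(n):
    """Orthonormal Sylvester Hadamard matrix (n must be a power of two)."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H / np.sqrt(n)

rng = np.random.default_rng(0)
d_in, d_out = 256, 64
x = rng.standard_t(df=2, size=d_in)          # heavy-tailed activation sample
W = rng.standard_normal((d_out, d_in))
H = normalized_hadamard(d_in)

# y = W @ (H @ x) == (W @ H) @ x, so pre-computing W' = W @ H preserves the
# layer's output exactly while the rotation spreads mass across coordinates.
y_ref   = W @ (H @ x)                        # explicit rotate-then-project
y_fused = (W @ H) @ x                        # rotation absorbed into the weights
assert np.allclose(y_ref, y_fused)

print("entropy of raw activations   :", activation_entropy(x))
print("entropy of rotated activations:", activation_entropy(H @ x))
print("peak |coordinate| before/after:", np.abs(x).max(), np.abs(H @ x).max())
```

Because H is orthonormal, the product W @ H can be formed once offline, so the reparameterization adds no inference-time cost; the rotated activations typically show a smaller peak coordinate for heavy-tailed inputs, matching the abstract's "reducing peak-coordinate concentration." The entropy estimate is the kind of cheap runtime statistic the paper uses to guide fusion-boundary and tile-size choices.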

Bo Ma, Jinsong Wu, Hongjiang Wei, Weiqi Yan • 2026

Related benchmarks

Task | Dataset | Result | Rank
Language Modeling | WikiText-103 | PPL 809.4 | 189
Language Modeling | PG-19 | Perplexity 11.66 | 160
Long-context Language Understanding | LongBench (20 samples/task) | NarrQA Performance 1.91 | 4
Language Model Inference | Sequence Bucket Short | Latency (ms) 39.26 | 3
Language Model Inference | Sequence Bucket Medium | Latency (ms) 52.88 | 3
Language Model Inference | Sequence Bucket Long | Latency (ms) 69.58 | 3
Language Model Inference | Sequence Bucket Ultra-long | Latency (ms) 77.97 | 3
