Lngram: N-gram Conditional Memory in Latent Space

About

Sequence modeling requires both compositional reasoning and local static knowledge retrieval, yet standard Transformers handle both through dense computation. Engram partially decouples retrieval from the backbone, but its token-based keys remain tied to text tokenization and hash compression. We propose Lngram, a latent-space conditional memory module that learns discrete symbols directly from hidden states and performs N-gram lookup over these symbols. This design removes the dependence on tokenizer IDs and naturally extends to non-text modalities. In our evaluated settings, Lngram outperforms Transformer and Engram baselines, consistently reduces perplexity in long-context language modeling, and effectively injects domain knowledge when added post hoc to pretrained models. Joint training with the backbone further surpasses full fine-tuning, while experiments on vision-language and vision-language-action tasks show overall gains. Analyses with LogitLens and CKA suggest that Lngram enables prediction-relevant information to emerge earlier, increasing effective depth with limited inference and memory overhead. Code is available at https://github.com/zyaaa-ux/Lngram.
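The lookup path described above (hidden states, then discrete symbols, then an N-gram memory read) can be pictured with a small sketch. This is an illustrative approximation, not the authors' implementation. The nearest-centroid quantizer, the fixed codebook, the modular slot indexing, the additive injection, and the function names `quantize`, `ngram_slots`, and `inject` are all assumptions made for the example. In the paper the symbols are learned, and the details may differ.

```python
# Hypothetical sketch of latent-space N-gram memory lookup (pure Python, no dependencies).

def quantize(hidden, codebook):
    """Assign each hidden vector the index of its nearest codebook vector (assumed quantizer)."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return [min(range(len(codebook)), key=lambda k: sqdist(h, codebook[k]))
            for h in hidden]

def ngram_slots(symbols, n, num_symbols, table_size):
    """Turn the last n symbols at each position into a memory slot index.

    Positions with fewer than n predecessors are left-padded with a dedicated
    pad symbol (value num_symbols), so keys are read in base num_symbols + 1.
    """
    base, pad = num_symbols + 1, num_symbols
    slots = []
    for t in range(len(symbols)):
        window = symbols[max(0, t - n + 1): t + 1]
        window = [pad] * (n - len(window)) + window
        key = 0
        for s in window:
            key = key * base + s
        slots.append(key % table_size)  # assumed: simple modular indexing
    return slots

def inject(hidden, slots, memory):
    """Add the retrieved memory vector to each hidden vector (assumed additive fusion)."""
    return [[h + m for h, m in zip(vec, memory[slot])]
            for vec, slot in zip(hidden, slots)]

# Toy usage: three 2-d hidden states, two symbols, bigram lookup, eight memory slots.
codebook = [[0.0, 0.0], [1.0, 1.0]]
hidden = [[0.25, 0.0], [1.0, 1.25], [0.25, -0.25]]
memory = [[0.0, 0.0] for _ in range(8)]
memory[6] = [1.0, 1.0]

symbols = quantize(hidden, codebook)
slots = ngram_slots(symbols, n=2, num_symbols=len(codebook), table_size=len(memory))
updated = inject(hidden, slots, memory)
print(symbols, slots, updated)
```

Because the keys come from quantized hidden states rather than tokenizer IDs, the same lookup applies unchanged to any modality that produces hidden vectors, which is the property the abstract emphasizes.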

Yunao Zheng, Guoyang Xia, Xiaojie Wang, Lei Ren (2026)

Related benchmarks

Task                               Dataset                   Result                      Rank
Commonsense Reasoning              WinoGrande                Accuracy 54.38              1442
Robot Manipulation                 LIBERO (test)             Average Success Rate 98.5   220
Multi-task Language Understanding  MMLU                      Accuracy 26.19              136
Physical Commonsense Reasoning     PIQA                      Accuracy 69.26              99
Common Sense Reasoning             HellaSwag                 Accuracy 44.81              23
Science Question Answering         SciQ                      Accuracy 71                 16
Multimodal Understanding           SEED-Bench (val)          --                          16
Domain Knowledge Injection         BDD driving exam          Accuracy 62.45              4
Domain Knowledge Injection         CNK driving exam dataset  Accuracy 81.02              4