
Augmenting Language Models with Long-Term Memory

About

Existing large language models (LLMs) can only process fixed-size inputs due to the input length limit, preventing them from utilizing rich long-context information from past inputs. To address this, we propose a framework, Language Models Augmented with Long-Term Memory (LongMem), which enables LLMs to memorize long history. We design a novel decoupled network architecture with the original backbone LLM frozen as a memory encoder and an adaptive residual side-network as a memory retriever and reader. Such a decoupled memory design can easily cache and update long-term past contexts for memory retrieval without suffering from memory staleness. Enhanced with memory-augmented adaptation training, LongMem can thus memorize long past context and use long-term memory for language modeling. The proposed memory retrieval module can handle unlimited-length context in its memory bank to benefit various downstream tasks. Typically, LongMem can enlarge the long-form memory to 65k tokens and thus cache many-shot extra demonstration examples as long-form memory for in-context learning. Experiments show that our method outperforms strong long-context models on ChapterBreak, a challenging long-context modeling benchmark, and achieves remarkable improvements on memory-augmented in-context learning over LLMs. The results demonstrate that the proposed method is effective in helping language models memorize and utilize long-form contents. Our code is open-sourced at https://aka.ms/LongMem.
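The core idea in the abstract — cache key/value pairs from past contexts in a capacity-limited memory bank and retrieve the nearest neighbours for each query — can be sketched in a few lines. This is a minimal illustration under assumed names and shapes (`MemoryBank`, `cache`, `retrieve` are hypothetical), not LongMem's actual implementation, which operates on frozen-backbone attention keys/values with a trained side-network reader:

```python
import numpy as np

class MemoryBank:
    """Sketch of a cached long-term memory: stores key/value vectors
    from past contexts and retrieves top-k neighbours for a query."""

    def __init__(self, dim, capacity=65536):
        self.dim = dim
        self.capacity = capacity          # e.g. ~65k cached tokens
        self.keys = np.empty((0, dim))
        self.values = np.empty((0, dim))

    def cache(self, keys, values):
        # Append new key/value pairs; keep only the most recent
        # `capacity` entries, evicting the oldest (no staleness:
        # the frozen encoder means cached entries never go out of date).
        self.keys = np.vstack([self.keys, keys])[-self.capacity:]
        self.values = np.vstack([self.values, values])[-self.capacity:]

    def retrieve(self, query, k=4):
        # Dot-product similarity between the query and all cached keys,
        # then return the k most similar key/value pairs.
        scores = self.keys @ query
        top = np.argsort(scores)[-k:][::-1]
        return self.keys[top], self.values[top]

rng = np.random.default_rng(0)
bank = MemoryBank(dim=8, capacity=16)
bank.cache(rng.standard_normal((32, 8)), rng.standard_normal((32, 8)))
print(bank.keys.shape)  # (16, 8): capacity-limited cache
k, v = bank.retrieve(rng.standard_normal(8), k=4)
print(k.shape, v.shape)  # (4, 8) (4, 8)
```

In LongMem itself, retrieval runs at chunk granularity with an approximate nearest-neighbour index rather than this brute-force scan, and the retrieved memories are fused via joint attention in the side-network.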

Weizhi Wang, Li Dong, Hao Cheng, Xiaodong Liu, Xifeng Yan, Jianfeng Gao, Furu Wei• 2023

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Dialogue Response Generation | MSC | B-4 Score | 33.3 | 38 |
| Dialogue Response Generation | Chronicle | B-4 | 29.9 | 38 |
| Response Generation | Chronicle and MSC Average | CEA | 47.3 | 30 |
| Event Correlation Evaluation | Chronicle, MSC, and LoCoMo Average | CEA | 43.2 | 12 |
| Dialogue Response Generation | LoCoMo | BLEU-4 | 25.3 | 8 |
| Instruction Following with Long-term Memory | Human Evaluation 1-10 scale (test) | Coherence | 7.7 | 6 |
| Generation and retrieval | MiSC multi-speaker | Coherence | 71 | 3 |
