
LightMem: Lightweight and Efficient Memory-Augmented Generation

About

Despite their remarkable capabilities, Large Language Models (LLMs) struggle to effectively leverage historical interaction information in dynamic and complex environments. Memory systems enable LLMs to move beyond stateless interactions by introducing persistent information storage, retrieval, and utilization mechanisms. However, existing memory systems often introduce substantial time and computational overhead. To this end, we introduce a new memory system called LightMem, which strikes a balance between the performance and efficiency of memory systems. Inspired by the Atkinson-Shiffrin model of human memory, LightMem organizes memory into three complementary stages. First, cognition-inspired sensory memory rapidly filters irrelevant information through lightweight compression and groups the remaining information by topic. Next, topic-aware short-term memory consolidates these topic-based groups, organizing and summarizing content for more structured access. Finally, long-term memory with sleep-time update employs an offline procedure that decouples consolidation from online inference. On LongMemEval and LoCoMo, using GPT and Qwen backbones, LightMem consistently surpasses strong baselines, improving QA accuracy by up to 7.7% / 29.3%, reducing total token usage by up to 38x / 20.9x and API calls by up to 30x / 55.5x, while purely online test-time costs are even lower, achieving up to 106x / 117x token reduction and 159x / 310x fewer API calls. The code is available at https://github.com/zjunlp/LightMem.
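
The three stages described above can be pictured with a small toy sketch. This is not LightMem's implementation or API: every name here (`sensory_filter`, `assign_topic`, `ToyMemory`, the stopword list, the keyword-overlap topic rule) is a hypothetical stand-in, and the "compression" and "summarization" steps are trivial string operations where a real system would presumably call a compressor or an LLM. It only illustrates the separation between online filtering and grouping, and an offline consolidation step.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# Hypothetical filler list; a real system would use a learned compressor.
STOPWORDS = {"the", "a", "an", "um", "uh", "so", "well", "i", "to", "of", "and"}


def _norm(word):
    """Lowercase a token and strip surrounding punctuation."""
    return word.lower().strip(".,!?")


def sensory_filter(message):
    """Stage 1 (toy): drop filler tokens as a stand-in for compression."""
    return " ".join(w for w in message.split() if _norm(w) not in STOPWORDS)


def assign_topic(message, topic_keywords):
    """Stage 1 (toy): pick the topic with the largest keyword overlap."""
    words = {_norm(w) for w in message.split()}
    best, best_overlap = "misc", 0
    for topic, keywords in topic_keywords.items():
        overlap = len(words & keywords)
        if overlap > best_overlap:
            best, best_overlap = topic, overlap
    return best


@dataclass
class ToyMemory:
    topic_keywords: dict
    short_term: dict = field(default_factory=lambda: defaultdict(list))
    pending: list = field(default_factory=list)
    long_term: dict = field(default_factory=dict)

    def observe(self, message):
        """Online: filter the message and buffer it under its topic."""
        compressed = sensory_filter(message)
        topic = assign_topic(compressed, self.topic_keywords)
        self.short_term[topic].append(compressed)

    def flush_short_term(self):
        """Stage 2 (toy): 'summarize' each topic group by joining its items."""
        for topic, items in self.short_term.items():
            self.pending.append((topic, " | ".join(items)))
        self.short_term.clear()

    def sleep_time_update(self):
        """Stage 3 (toy): offline consolidation, run outside the request path."""
        for topic, summary in self.pending:
            self.long_term.setdefault(topic, []).append(summary)
        self.pending.clear()


if __name__ == "__main__":
    memory = ToyMemory({"travel": {"flight", "paris"}, "food": {"pizza", "dinner"}})
    memory.observe("Um, I booked a flight to Paris.")
    memory.observe("So the pizza dinner was great!")
    memory.flush_short_term()    # cheap online step
    memory.sleep_time_update()   # deferred offline step
    print(memory.long_term)
```

The point of the split is that `observe` and `flush_short_term` stay cheap at inference time, while the more expensive merge into the long-term store is deferred to `sleep_time_update`, mirroring the decoupling the abstract describes.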

Jizhan Fang, Xinle Deng, Haoming Xu, Ziyan Jiang, Yuqi Tang, Ziwen Xu, Shumin Deng, Yunzhi Yao, Mengru Wang, Shuofei Qiao, Huajun Chen, Ningyu Zhang • 2025

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Multi-hop Question Answering | HotpotQA | F1 Score | 37.68 | 294 |
| Long-term memory evaluation | Locomo | Overall F1 | 44.73 | 119 |
| Long-context Question Answering | Locomo | F1 (Multi Hop) | 32.11 | 109 |
| Long-context Memory Retrieval | Locomo | Single-hop | 76.61 | 70 |
| Multi-hop Question Answering | Locomo | F1 | 44.86 | 67 |
| Open-domain Question Answering | Locomo | F1 | 0.2619 | 53 |
| Single-hop Question Answering | Locomo | F1 | 0.5588 | 53 |
| Long-context Memory Evaluation | LongMemEval | Average Score | 67.5 | 52 |
| Question Answering | Locomo | Single Hop F1 | 41.79 | 38 |
| Long-context reasoning and retrieval | LoCoMo (test) | Single-Hop F1 | 81.21 | 37 |
Showing 10 of 44 rows
