
LightMem: Lightweight and Efficient Memory-Augmented Generation

About

Despite their remarkable capabilities, Large Language Models (LLMs) struggle to effectively leverage historical interaction information in dynamic and complex environments. Memory systems enable LLMs to move beyond stateless interactions by introducing persistent information storage, retrieval, and utilization mechanisms. However, existing memory systems often introduce substantial time and computational overhead. To this end, we introduce a new memory system called LightMem, which balances the performance and efficiency of memory systems. Inspired by the Atkinson-Shiffrin model of human memory, LightMem organizes memory into three complementary stages. First, cognition-inspired sensory memory rapidly filters irrelevant information through lightweight compression and groups the remainder by topic. Next, topic-aware short-term memory consolidates these topic-based groups, organizing and summarizing content for more structured access. Finally, long-term memory with sleep-time update employs an offline procedure that decouples consolidation from online inference. On LongMemEval and LoCoMo, using GPT and Qwen backbones, LightMem consistently surpasses strong baselines, improving QA accuracy by up to 7.7% / 29.3%, reducing total token usage by up to 38x / 20.9x and API calls by up to 30x / 55.5x, while purely online test-time costs are even lower, achieving up to 106x / 117x token reduction and 159x / 310x fewer API calls. The code is available at https://github.com/zjunlp/LightMem.
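The three stages described above can be sketched as a minimal pipeline. This is an illustrative sketch only, not LightMem's actual implementation: the function names, the length-based filtering heuristic, and the pre-labeled topics are all assumptions (a real system would use learned compression, an LLM summarizer, and automatic topic segmentation).

```python
from collections import defaultdict

def sensory_filter(messages, min_len=20):
    """Stage 1 (sketch): lightweight filtering -- drop messages too short to
    carry information, then group the rest by topic."""
    groups = defaultdict(list)
    for topic, text in messages:
        if len(text) >= min_len:
            groups[topic].append(text)
    return groups

def consolidate_short_term(groups, max_chars=200):
    """Stage 2 (sketch): collapse each topic group into one structured entry.
    A real system would call an LLM summarizer here instead of truncating."""
    return {topic: " ".join(texts)[:max_chars] for topic, texts in groups.items()}

class LongTermMemory:
    """Stage 3 (sketch): long-term store whose consolidation runs offline
    ('sleep-time'), decoupled from cheap online writes."""
    def __init__(self):
        self.store = {}      # consolidated memory, read at inference time
        self.pending = []    # staged entries awaiting offline consolidation

    def stage(self, summaries):
        # Online path: just append; no expensive work on the request path.
        self.pending.append(summaries)

    def sleep_update(self):
        # Offline path: merge all pending entries into the store.
        for summaries in self.pending:
            for topic, summary in summaries.items():
                self.store.setdefault(topic, []).append(summary)
        self.pending.clear()

# Usage: two informative messages survive filtering; "ok" is dropped.
msgs = [("travel", "I visited Kyoto last spring and loved the temples there."),
        ("travel", "ok"),
        ("work", "My new job at the lab starts in March, focusing on robotics.")]
ltm = LongTermMemory()
ltm.stage(consolidate_short_term(sensory_filter(msgs)))
ltm.sleep_update()
print(sorted(ltm.store))  # → ['travel', 'work']
```

The key efficiency idea mirrored here is that `stage` is the only operation on the online path; all merging happens in `sleep_update`, which can run when the system is idle.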

Jizhan Fang, Xinle Deng, Haoming Xu, Ziyan Jiang, Yuqi Tang, Ziwen Xu, Shumin Deng, Yunzhi Yao, Mengru Wang, Shuofei Qiao, Huajun Chen, Ningyu Zhang • 2025

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Multi-hop Question Answering | HotpotQA | F1 Score: 37.68 | 221 |
| Long-term Memory Evaluation | LoCoMo | Overall F1: 44.73 | 70 |
| Multi-hop Question Answering | LoCoMo | F1: 44.86 | 67 |
| Long-context Question Answering | LoCoMo | Average F1: 53.84 | 64 |
| Long-context Memory Retrieval | LoCoMo | Single-hop: 76.61 | 55 |
| Open-domain Question Answering | LoCoMo | F1: 0.2619 | 53 |
| Single-hop Question Answering | LoCoMo | F1: 0.5588 | 53 |
| Long-context Reasoning and Retrieval | LoCoMo (test) | Single-Hop F1: 81.21 | 37 |
| Temporal Question Answering | LoCoMo | F1: 0.6466 | 36 |
| Long-context Memory Evaluation | LongMemEval | Single-Turn Preference: 91.67 | 28 |

(Showing 10 of 25 rows.)
