HiMem: Hierarchical Long-Term Memory for LLM Long-Horizon Agents

About

Although long-term memory systems have made substantial progress in recent years, they still exhibit clear limitations in adaptability, scalability, and self-evolution under continuous interaction settings. Inspired by cognitive theories, we propose HiMem, a hierarchical long-term memory framework for long-horizon dialogues, designed to support memory construction, retrieval, and dynamic updating during sustained interactions. HiMem constructs cognitively consistent Episode Memory via a Topic-Aware Event–Surprise Dual-Channel Segmentation strategy, and builds Note Memory that captures stable knowledge through a multi-stage information extraction pipeline. These two memory types are semantically linked to form a hierarchical structure that bridges concrete interaction events and abstract knowledge, enabling efficient retrieval without sacrificing information fidelity. HiMem supports both hybrid and best-effort retrieval strategies to balance accuracy and efficiency, and incorporates conflict-aware Memory Reconsolidation to revise and supplement stored knowledge based on retrieval feedback. This design enables continual memory self-evolution over long-term use. Experimental results on long-horizon dialogue benchmarks demonstrate that HiMem consistently outperforms representative baselines in accuracy, consistency, and long-term reasoning, while maintaining favorable efficiency. Overall, HiMem provides a principled and scalable design paradigm for building adaptive and self-evolving LLM-based conversational agents. The code is available at https://github.com/jojopdq/HiMem.

Ningning Zhang, Xingxing Yang, Zhizhong Tan, Weiping Deng, Wenyong Wang • 2026
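
The two-level structure in the abstract (abstract notes linked to concrete episodes) can be illustrated with a small sketch. This is not the HiMem implementation: the class names, the word-overlap scorer, the threshold, and the reading of "best-effort retrieval" as "answer from Note Memory when it matches well enough, otherwise fall back to Episode Memory" are all assumptions made for illustration, since the abstract does not define these details.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Episode:
    """A concrete interaction segment (hypothetical structure)."""
    episode_id: str
    text: str


@dataclass
class Note:
    """Stable knowledge linked back to its source episodes (hypothetical)."""
    text: str
    episode_ids: List[str] = field(default_factory=list)


def overlap_score(query: str, text: str) -> float:
    """Fraction of query words found in text; a stand-in for a real retriever."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


class TwoLevelMemory:
    """Toy hierarchical memory: notes on top, episodes underneath."""

    def __init__(self, threshold: float = 0.5) -> None:
        self.episodes: Dict[str, Episode] = {}
        self.notes: List[Note] = []
        self.threshold = threshold  # assumed cutoff for a "good enough" note

    def add_episode(self, episode: Episode) -> None:
        self.episodes[episode.episode_id] = episode

    def add_note(self, note: Note) -> None:
        self.notes.append(note)

    def retrieve(self, query: str, top_k: int = 2) -> Tuple[str, List[str]]:
        """Assumed best-effort policy: try notes first, then fall back to episodes."""
        hits = [n.text for n in self.notes
                if overlap_score(query, n.text) >= self.threshold]
        if hits:
            return "note", hits[:top_k]
        # No note matched well enough: search the more detailed episode level.
        ranked = sorted(self.episodes.values(),
                        key=lambda e: overlap_score(query, e.text),
                        reverse=True)
        matches = [e.text for e in ranked if overlap_score(query, e.text) > 0]
        return "episode", matches[:top_k]


if __name__ == "__main__":
    memory = TwoLevelMemory()
    memory.add_episode(Episode("e1", "User moved to Berlin in March for a new job"))
    memory.add_note(Note("user lives in Berlin", ["e1"]))
    print(memory.retrieve("user lives Berlin"))
    print(memory.retrieve("new job in March"))
```

In this sketch the cheaper note lookup handles queries about stable facts, and the episode level is consulted only when no note clears the threshold, which is one plausible way to trade efficiency against fidelity.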

Related benchmarks

Task	Dataset	Result
Multi-hop Question Answering	Locomo	F1 33.8	125
Single-hop Question Answering	Locomo	F1 0.491	111
Open-domain Question Answering	Locomo	F1 0.219	111
Temporal Question Answering	Locomo	F1 0.468	85
Role-playing Quality Evaluation	RoleMemo (test)	Information Richness 4.05	14
Memory Construction Quality	RoleMemo 1.0 (test)	Interpretive Attribution Fact (Recall@10) 37	10
Memory Management	Heavy admitted-content	Recall 100	6
Memory Management and Retrieval	Heavy admitted-content (full matrix)	Recall 100	6
