
Nemori: Self-Organizing Agent Memory Inspired by Cognitive Science

About

Large Language Models (LLMs) demonstrate remarkable capabilities, yet their inability to maintain persistent memory in long contexts limits their effectiveness as autonomous agents in long-term interactions. While existing memory systems have made progress, their reliance on arbitrary granularity for defining the basic memory unit and passive, rule-based mechanisms for knowledge extraction limits their capacity for genuine learning and evolution. To address these foundational limitations, we present Nemori, a novel self-organizing memory architecture inspired by human cognitive principles. Nemori's core innovation is twofold: First, its Two-Step Alignment Principle, inspired by Event Segmentation Theory, provides a principled, top-down method for autonomously organizing the raw conversational stream into semantically coherent episodes, solving the critical issue of memory granularity. Second, its Predict-Calibrate Principle, inspired by the Free-energy Principle, enables the agent to proactively learn from prediction gaps, moving beyond pre-defined heuristics to achieve adaptive knowledge evolution. This offers a viable path toward handling the long-term, dynamic workflows of autonomous agents. Extensive experiments on the LoCoMo and LongMemEval benchmarks demonstrate that Nemori significantly outperforms prior state-of-the-art systems, with its advantage being particularly pronounced in longer contexts.

Jiayan Nan, Wenquan Ma, Wenlong Wu, Yize Chen • 2025

Related benchmarks

Task                                 | Dataset              | Metric                 | Result | Rank
Long-term memory evaluation          | Locomo               | Overall F1             | 52.1   | 70
Multi-hop Question Answering         | Locomo               | F1                     | 44.2   | 67
Long-context Question Answering      | Locomo               | Average F1             | 51.21  | 64
Long-context Memory Retrieval        | Locomo               | Single-hop             | 84.9   | 55
Single-hop Question Answering        | Locomo               | F1                     | 0.588  | 53
Open-domain Question Answering       | Locomo               | F1                     | 0.258  | 53
Long-context reasoning and retrieval | LoCoMo (test)        | Single-Hop F1          | 87.04  | 37
Temporal Question Answering          | Locomo               | F1                     | 0.5838 | 36
Long-context Memory Evaluation       | LongMemEval          | Single-Turn Preference | 86.7   | 28
Long-term memory evaluation          | LongMemEval S (test) | KU (Knowledge Update)  | 79.5   | 27
Showing 10 of 23 rows
