ES-Mem: Event Segmentation-Based Memory for Long-Term Dialogue Agents

About

Memory is critical for dialogue agents to maintain coherence and enable continuous adaptation in long-term interactions. While existing memory mechanisms offer basic storage and retrieval capabilities, they are hindered by two primary limitations: (1) rigid memory granularity often disrupts semantic integrity, resulting in fragmented and incoherent memory units; (2) prevalent flat retrieval paradigms rely solely on surface-level semantic similarity, neglecting the structural cues of discourse required to navigate and locate specific episodic contexts. To mitigate these limitations, drawing inspiration from Event Segmentation Theory, we propose ES-Mem, a framework incorporating two core components: (1) a dynamic event segmentation module that partitions long-term interactions into semantically coherent events with distinct boundaries; (2) a hierarchical memory architecture that constructs multi-layered memories and leverages boundary semantics to anchor specific episodic memory for precise context localization. Evaluations on two memory benchmarks demonstrate that ES-Mem yields consistent performance gains over baseline methods. Furthermore, the proposed event segmentation module exhibits robust applicability on dialogue segmentation datasets.
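As a rough illustration of the two ideas in the abstract (cutting a dialogue into events, then retrieving hierarchically by event before turn), here is a minimal toy sketch. It is not ES-Mem's implementation: the abstract does not say how boundaries are detected or how the memory layers are built. The sketch swaps in a simple bag-of-words cosine similarity, and the names `segment` and `retrieve`, the `threshold` value, and the sample dialogue are all hypothetical.

```python
from collections import Counter
from math import sqrt


def bag(text):
    """Lowercased bag-of-words vector (a crude stand-in for a real encoder)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two Counter vectors; 0.0 if either is empty."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def segment(turns, threshold=0.1):
    """Open a new event whenever two adjacent turns share too little vocabulary.

    Assumption: the threshold rule is illustrative only, not the paper's
    segmentation module.
    """
    if not turns:
        return []
    events = [[turns[0]]]
    for prev, cur in zip(turns, turns[1:]):
        if cosine(bag(prev), bag(cur)) < threshold:
            events.append([cur])      # boundary: start a new event
        else:
            events[-1].append(cur)    # same event continues
    return events


def retrieve(events, query):
    """Two-level lookup: pick the best-matching event, then the best turn in it."""
    q = bag(query)
    best_event = max(events, key=lambda ev: cosine(bag(" ".join(ev)), q))
    best_turn = max(best_event, key=lambda turn: cosine(bag(turn), q))
    return best_event, best_turn


turns = [
    "I adopted a puppy last week",
    "What breed is the puppy",
    "The puppy is a beagle",
    "My flight to Tokyo leaves Friday",
    "Is the flight to Tokyo direct",
]
events = segment(turns)
event, turn = retrieve(events, "flight leaves Friday")
print(len(events), "events; best turn:", turn)
```

The contrast with flat retrieval is that the query is first matched against whole events, so the turn-level search only runs inside the selected event.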

Huhai Zou, Tianhao Sun, Chuanjiang He, Yu Tian, Zhenyang Li, Li Jin, Nayu Liu, Jiang Zhong, Kaiwen Wei • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Long-term memory evaluation | Locomo | Overall F1 | 45.56 | 70 |
| Long-context Question Answering | Locomo | Average F1 | 45.56 | 64 |
| Dialogue Memory Accuracy | LongMemEval-S (N=500) | Temporal Accuracy | 64.66 | 17 |
| Dialogue Segmentation | DialSeg711 | Pk | 0.172 | 14 |
| Dialogue Segmentation | SuperDialSeg | Pk | 0.434 | 10 |
| Dialogue Segmentation | TIAGE | Pk | 0.382 | 10 |
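
To help read the Pk scores reported above: assuming Pk here is the standard windowed segmentation error rate (lower is better), it slides a window of width k over the utterances and counts how often the reference and the predicted segmentation disagree on whether the two ends of the window fall in the same segment. The sketch below is a minimal version under that assumption. The function names, the input format (lists of segment lengths), and the default choice of k (half the mean reference segment length) are illustrative, and evaluation toolkits differ in small conventions such as rounding.

```python
def to_labels(segment_lengths):
    """Expand segment lengths, e.g. [2, 3], into per-utterance ids [0, 0, 1, 1, 1]."""
    return [seg_id for seg_id, n in enumerate(segment_lengths) for _ in range(n)]


def pk(ref_lengths, hyp_lengths, k=None):
    """Windowed segmentation error between a reference and a hypothesis.

    Both arguments are lists of segment lengths covering the same dialogue.
    Assumption: k defaults to half the mean reference segment length.
    """
    ref, hyp = to_labels(ref_lengths), to_labels(hyp_lengths)
    if len(ref) != len(hyp):
        raise ValueError("reference and hypothesis must cover the same utterances")
    n = len(ref)
    if k is None:
        k = max(1, round(n / (2 * len(ref_lengths))))
    if k >= n:
        raise ValueError("window size k must be smaller than the dialogue length")

    errors = 0
    for i in range(n - k):
        same_in_ref = ref[i] == ref[i + k]
        same_in_hyp = hyp[i] == hyp[i + k]
        errors += same_in_ref != same_in_hyp  # disagreement counts as one error
    return errors / (n - k)


reference = [3, 3, 4]               # three events over ten utterances
print(pk(reference, [3, 3, 4]))     # identical segmentation
print(pk(reference, [4, 2, 4]))     # one boundary placed one utterance late
print(pk(reference, [10]))          # no boundaries predicted
```

Because the comparison is windowed, a boundary that is off by one utterance is penalised less than a boundary that is missed entirely.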
