
Towards Lifelong Dialogue Agents via Timeline-based Memory Management

About

To achieve lifelong human-agent interaction, dialogue agents need to constantly memorize perceived information and properly retrieve it for response generation (RG). While prior studies focus on getting rid of outdated memories to improve retrieval quality, we argue that such memories provide rich, important contextual cues for RG (e.g., changes in user behaviors) in long-term conversations. We present THEANINE, a framework for LLM-based lifelong dialogue agents. THEANINE discards memory removal and manages large-scale memories by linking them based on their temporal and cause-effect relations. Enabled by this linking structure, THEANINE augments RG with memory timelines - series of memories representing the evolution or causality of relevant past events. Along with THEANINE, we introduce TeaFarm, a counterfactual-driven evaluation scheme that addresses the limitations of G-Eval and human effort when assessing agent performance in integrating past memories into RG. A supplementary video for THEANINE and data for TeaFarm are at https://huggingface.co/spaces/ResearcherScholar/Theanine.
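
The linking idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the names `Memory`, `MemoryGraph`, `link`, and `timelines` are hypothetical, the relation labels are plain strings, and the rule that links must point from an earlier session to a later one is an assumption made here to keep the graph acyclic. The sketch keeps every memory, links memories by relation, and returns each chain of linked memories that passes through a retrieved memory.

```python
from dataclasses import dataclass, field


@dataclass
class Memory:
    mem_id: int
    session: int  # index of the conversation session (assumed time proxy)
    text: str


@dataclass
class MemoryGraph:
    """Hypothetical store: memories are never removed, only linked."""
    memories: dict = field(default_factory=dict)
    successors: dict = field(default_factory=dict)    # mem_id -> [(mem_id, relation)]
    predecessors: dict = field(default_factory=dict)  # mem_id -> [mem_id]

    def add(self, memory):
        self.memories[memory.mem_id] = memory
        self.successors.setdefault(memory.mem_id, [])
        self.predecessors.setdefault(memory.mem_id, [])

    def link(self, earlier_id, later_id, relation):
        # Assumption: links always point strictly forward in time.
        if self.memories[earlier_id].session >= self.memories[later_id].session:
            raise ValueError("links must point from an earlier to a later session")
        self.successors[earlier_id].append((later_id, relation))
        self.predecessors[later_id].append(earlier_id)

    def _roots(self, mem_id):
        # Walk backward until reaching memories with no predecessor.
        preds = self.predecessors[mem_id]
        if not preds:
            return [mem_id]
        roots = []
        for pred in preds:
            for root in self._roots(pred):
                if root not in roots:
                    roots.append(root)
        return roots

    def _extend(self, path, out):
        # Walk forward, emitting one path per leaf reached.
        nexts = self.successors[path[-1]]
        if not nexts:
            out.append(path)
            return
        for nxt, _relation in nexts:
            self._extend(path + [nxt], out)

    def timelines(self, mem_id):
        """Return every root-to-leaf chain of memory ids that contains mem_id."""
        paths = []
        for root in self._roots(mem_id):
            self._extend([root], paths)
        return [path for path in paths if mem_id in path]


graph = MemoryGraph()
graph.add(Memory(1, 1, "User is afraid of ships."))
graph.add(Memory(2, 2, "User took a sailing class."))
graph.add(Memory(3, 3, "User now enjoys cruises."))
graph.add(Memory(4, 2, "User likes green tea."))
graph.link(1, 2, "cause")
graph.link(2, 3, "temporal")

for path in graph.timelines(2):
    print(" -> ".join(graph.memories[i].text for i in path))
```

In this toy example the outdated memory ("afraid of ships") stays in the store and appears at the start of the timeline, so a response generator could see how the user's attitude changed rather than only its latest state.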

Kai Tzu-iunn Ong, Namyoung Kim, Minju Gwak, Hyungjoo Chae, Taeyoon Kwon, Yohan Jo, Seung-won Hwang, Dongha Lee, Jinyoung Yeo • 2024

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Question Answering | NarrativeQA (test) | ROUGE-L 7.84 | 68 |
| Dialogue Response Generation | Chronicle | B-4 30.9 | 38 |
| Dialogue Response Generation | MSC | B-4 Score 33.9 | 38 |
| Question Answering | Wikihop (test) | Accuracy 19.75 | 32 |
| Response Generation | Chronicle and MSC Average | CEA 54.1 | 30 |
| Question Answering | Merged QA HotpotQA, NarrativeQA, WikiHop (test) | Accuracy 24.47 | 24 |
| Question Answering | HotpotQA (test) | Accuracy 42.79 | 24 |
| Event Correlation Evaluation | Chronicle, MSC, and LoCoMo Average | CEA 49.8 | 12 |
| Dialogue Response Generation | Locomo | BLEU-4 26.5 | 8 |
| Instruction Following with Long-term Memory | Human Evaluation 1-10 scale (test) | Coherence 8 | 6 |
Showing 10 of 13 rows
