Dialogue Response Generation on LoCoMo
[Chart: best BLEU-4 29.1 (DialogLM + EventWeave), Mar 29, 2025. Metrics shown: BLEU-4, ROUGE-L, Mauve, GPT-4 Score.]
Updated 8d ago
Evaluation Results
Method                 Tags                       Date     BLEU-4  ROUGE-L  Mauve  GPT-4 Score
DialogLM + EventWeave  Domain Specialization=...  2025.03  29.1    43.4     66     7.8
GPT-4o + EventWeave    Memory Augmentation=Ev...  2025.03  28.4    42.6     65     7.6
GPT-4o + LifeLongMem   Memory Augmentation=Li...  2025.03  26.5    40.7     63     7
GPT-4o + MemWalker     Memory Augmentation=Me...  2025.03  25.7    40.1     62     6.7
GPT-4o + LongMem       Memory Augmentation=Lo...  2025.03  25.3    39.8     62     6.6
GPT-4o + ProactiveCoT  Memory Augmentation=Pr...  2025.03  24.9    39.2     61     6.3
DialogLM               Domain Specialization=...  2025.03  22.8    37.5     60     4.9
GPT-4o (vanilla)       Memory Augmentation=None   2025.03  21.2    35.8     58     4.3