PersonaMem

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Query-Answering | PersonaMem 128K context length | Query-Answering Accuracy: 70 | 60 |
| Query-Answering | PersonaMem 32K context length | Query-Answering Accuracy: 90 | 60 |
| Query-Answering | PersonaMem 1M context length | Query-Answering Accuracy: 72 | 38 |
| Personalized Dialogue Response Generation | PersonaMem 1.0 | Overall Score: 76.06 | 33 |
| Memory-Augmented Dialogue | PersonaMem v1.0 (test) | Overall Score: 74.53 | 28 |
| Multiple-choice Query Answering | PersonaMem (Average) | Accuracy: 72 | 22 |
| Long-context Memory Retrieval and Reasoning | PersonaMem 128K | F1 Score: 23.75 | 20 |
| Long-context Memory Retrieval and Reasoning | PersonaMem 32K | F1 Score: 26.45 | 20 |
| Privacy Extraction | PersonaMem v2 (test) | F1 Score: 0.9448 | 18 |
| Response Selection | PersonaMem | Accuracy: 64.36 | 16 |
| Agentic Memory Management | PersonaMem | Preference Recall: 69.7 | 11 |
| Preference evolution over long multi-session histories | PersonaMem 128K context scale | Accuracy: 47.24 | 8 |
| Preference evolution over long multi-session histories | PersonaMem 32K context scale | Accuracy: 57.06 | 8 |
| Personality-based Memory | PERSONAMEM 32k context | Accuracy: 57.93 | 8 |
| Personalized Memory Retrieval | PersonaMem | Precision: 58.9 | 8 |
| Persona-based memory dialogue | PersonaMem | Normalized Score: 65.2 | 5 |
| Personalized Generation | PersonaMem 128K memory corpus 1.0 (test) | Revisit Reasons: 81.41 | 5 |
| Personalized Generation | PersonaMem 32K memory corpus 1.0 (test) | Revisit Reasons: 94.95 | 5 |
| Memory Retrieval | PersonaMem 128K | NDCG@1: 43.5 | 5 |
| Memory Retrieval | PersonaMem | Recall: 22.9 | 4 |
| Personalized memory reasoning | PERSONAMEM | Recall User Shared Facts: 67.81 | 4 |
| Personalized Generation | PersonaMem 1M memory corpus 1.0 (test) | Revisit Reasons: 77.87 | 4 |
| Long-term Text Memory | PersonaMem | Accuracy: 46.3 | 2 |