| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Query-Answering | PersonaMem 128K context length | Query-Answering Accuracy 70 | 60 |
| Query-Answering | PersonaMem 32K context length | Query-Answering Accuracy 90 | 60 |
| Query-Answering | PersonaMem 1M context length | Query-Answering Accuracy 72 | 38 |
| Personalized Dialogue Response Generation | PersonaMem 1.0 | Overall Score 76.06 | 33 |
| Memory-Augmented Dialogue | PersonaMem v1.0 (test) | Overall Score 74.53 | 28 |
| Multiple-choice Query Answering | PersonaMem (Average) | Accuracy 72 | 22 |
| Long-context Memory Retrieval and Reasoning | PersonaMem 128K | F1 Score 23.75 | 20 |
| Long-context Memory Retrieval and Reasoning | PersonaMem 32K | F1 Score 26.45 | 20 |
| Privacy Extraction | PersonaMem v2 (test) | F1 Score 0.9448 | 18 |
| Response Selection | PersonaMem | Accuracy 64.36 | 16 |
| Agentic Memory Management | PersonaMem | Preference Recall 69.7 | 11 |
| Preference evolution over long multi-session histories | PersonaMem 128K context scale | Accuracy 47.24 | 8 |
| Preference evolution over long multi-session histories | PersonaMem 32K context scale | Accuracy 57.06 | 8 |
| Personality-based Memory | PERSONAMEM 32k context | Accuracy 57.93 | 8 |
| Personalized Memory Retrieval | PersonaMem | Precision 58.9 | 8 |
| Persona-based memory dialogue | PersonaMem | Normalized Score 65.2 | 5 |
| Personalized Generation | PersonaMem 128K memory corpus 1.0 (test) | Revisit Reasons 81.41 | 5 |
| Personalized Generation | PersonaMem 32K memory corpus 1.0 (test) | Revisit Reasons 94.95 | 5 |
| Memory Retrieval | PersonaMem 128K | NDCG@1 43.5 | 5 |
| Memory Retrieval | PersonaMem | Recall 22.9 | 4 |
| Personalized memory reasoning | PERSONAMEM | Recall User Shared Facts 67.81 | 4 |
| Personalized Generation | PersonaMem 1M memory corpus 1.0 (test) | Revisit Reasons 77.87 | 4 |
| Long-term Text Memory | PersonaMem | Accuracy 46.3 | 2 |