| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Memory Agent Performance | MemoryAgentBench | Average Performance: 38.85 | 35 |
| Fact Consolidation (Single-Hop) | MemoryAgentBench (MAB) FC-SH 262K | Accuracy: 93 | 8 |
| Long-context memory management | MemoryAgentBench | Single-Doc Recall: 76 | 8 |
| Fact Consolidation (Multi-Hop) | MemoryAgentBench (MAB) FC-MH 262K | Accuracy: 27 | 5 |
| Fact Consolidation (Single-Hop) | MemoryAgentBench FC-SH Average | Accuracy: 0.948 | 4 |
| Fact Consolidation (Single-Hop) | MemoryAgentBench (MAB) FC-SH 64K | Accuracy: 95 | 4 |
| Fact Consolidation (Single-Hop) | MemoryAgentBench (MAB) FC-SH 32K | Accuracy: 92 | 4 |
| Fact Consolidation (Multi-Hop) | MemoryAgentBench (MAB) FC-MH Average | Accuracy: 30.2 | 1 |
| Fact Consolidation (Multi-Hop) | MemoryAgentBench (MAB) FC-MH 64K | Accuracy: 33 | 1 |
| Fact Consolidation (Multi-Hop) | MemoryAgentBench FC-MH 32K | Accuracy: 27 | 1 |
| Fact Consolidation (Multi-Hop) | MemoryAgentBench (MAB) FC-MH 6K | Accuracy: 34 | 1 |