MemoryAgentBench

Benchmarks

Task Name                        Dataset Name                          SOTA Result                 Trend
Memory Agent Performance         MemoryAgentBench                      Average Performance: 38.85  35
Fact Consolidation (Single-Hop)  MemoryAgentBench (MAB) FC-SH 262K     Accuracy: 93                8
Long-context memory management   MemoryAgentBench                      Single-Doc Recall: 76       8
Fact Consolidation (Multi-Hop)   MemoryAgentBench (MAB) FC-MH 262K     Accuracy: 27                5
Fact Consolidation (Single-Hop)  MemoryAgentBench FC-SH Average        Accuracy: 0.948             4
Fact Consolidation (Single-Hop)  MemoryAgentBench (MAB) FC-SH 64K      Accuracy: 95                4
Fact Consolidation (Single-Hop)  MemoryAgentBench (MAB) FC-SH 32K      Accuracy: 92                4
Fact Consolidation (Multi-Hop)   MemoryAgentBench (MAB) FC-MH Average  Accuracy: 30.2              1
Fact Consolidation (Multi-Hop)   MemoryAgentBench (MAB) FC-MH 64K      Accuracy: 33                1
Fact Consolidation (Multi-Hop)   MemoryAgentBench FC-MH 32K            Accuracy: 27                1
Fact Consolidation (Multi-Hop)   MemoryAgentBench (MAB) FC-MH 6K       Accuracy: 34                1