Temporal-reasoning on LongMemEval S (test)
Best F1 Score: 15.03 (HIPPOCAMPUS, Feb 14, 2026)
Metrics reported: F1 Score, Accuracy, LLM-as-a-Judge Score
Updated 2d ago
Evaluation Results
Method        Date      F1 Score   Accuracy   LLM-as-a-Judge Score
HIPPOCAMPUS   2026.02   15.03      17.29      2.15
MemOS         2026.02   11.23      12.97      1.94
MemoryOS      2026.02   9.78       11.24      1.83
A-mem         2026.02   8.28       9.51       1.72
MemGPT        2026.02   5.25       6.05       1.18
MemoryBank    2026.02   4.5        5.19       1.29
ReadAgent     2026.02   3.76       4.32       0.86