Long-context Memory Evaluation on LongMemEval v0.6.0 (strict)
[Chart: Accuracy. Highlighted point: TM Pro, Accuracy 92, May 6, 2026. Available metrics: Accuracy, Accuracy 95% CI (Lower Bound), Correct Count.]
Updated 27d ago
Evaluation Results
Method     Configuration              Links    Accuracy  Accuracy 95% CI (Lower Bound)  Correct Count
---------  -------------------------  -------  --------  -----------------------------  -------------
TM Pro     Variant=oracle, Answer...  2026.05  92        89.43                          460
TM Pro     Variant=3-run mean, st...  2026.05  87.8      84.64                          439
RAG        Backbone=ChromaDB, Ans...  2026.05  87        83.77                          435
EverMemOS  Answer model=gpt-4o        2026.05  83        79.4                           -
Engram     Answer model=gpt-4.1-mini  2026.05  82.2      78.61                          411
BM25       Answer model=gpt-4.1-mini  2026.05  81.6      77.97                          408
Mem0       Answer model=gpt-4.1-mini  2026.05  66        61.74                          330
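
The page does not say how many questions were scored or how the CI lower bound is computed. The sketch below is illustrative only and rests on two assumptions that are not stated in the source: a total of 500 questions (the value implied by rows such as 435 correct at 87 accuracy) and a two-sided 95% Wilson score interval (z = 1.96). Under those assumptions it approximately reproduces the listed lower bound for several rows; it is not confirmed as the leaderboard's method and may not match every row. The names `wilson_lower_bound` and `TOTAL_QUESTIONS` are hypothetical.

```python
import math


def wilson_lower_bound(correct: int, total: int, z: float = 1.96) -> float:
    """Lower end of the Wilson score interval for a binomial proportion."""
    p = correct / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return center - half_width


# Assumption: 500 scored questions (not stated on the page).
TOTAL_QUESTIONS = 500

# Correct counts taken from the results above.
rows = [("TM Pro (3-run mean)", 439), ("RAG", 435), ("Mem0", 330)]

for method, correct in rows:
    accuracy = 100 * correct / TOTAL_QUESTIONS
    lower = 100 * wilson_lower_bound(correct, TOTAL_QUESTIONS)
    print(f"{method}: accuracy={accuracy:.1f}, 95% CI lower bound={lower:.2f}")
```

The Wilson interval is a common choice for accuracy figures near 0 or 100 percent because, unlike the simple normal approximation, it stays inside the valid range and is not symmetric around the observed accuracy.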