
Long-context Memory Evaluation on BEHEMOTH LongMemEval (out-of-distribution)

Accuracy: 63.07

CluE

[Chart: Accuracy by date; data point at Apr 13, 2026]
Updated 1mo ago

Evaluation Results

Date      Accuracy
2026.04   63.07
2026.04   56.82
2026.04   46.02
2026.04   35.06
2026.04   29.71
2026.04   20.45