
Dialogue Memory Accuracy on LongMemEval-S (N=500)

91 Temporal Accuracy

Hindsight

[Chart: results by date, Dec 14, 2025 to Apr 18, 2026]
Updated 1mo ago

Evaluation Results

Date     Temporal Accuracy
2025.12  91                 87.2   94.9   97.1   96.4   80     91.4   -
2025.12  85.7               81.2   92.3   100    98.2   86.7   89     -
2025.12  82                 76.7   89.7   98.6   98.2   70     85.2   -
2025.12  81.2               75.2   87.2   97.1   100    76.7   84.6   -
2025.12  79.7               79.7   84.6   95.7   94.6   66.7   83.6   -
2025.12  76.7               71.4   88.5   97.1   96.4   70     81.6   -
2026.01  67.18              71.74  83.12  87.14  32.14  68.18  69.81  -
2026.01  64.66              66.17  78.21  84.29  82.14  73.33  72.4   -
2025.12  62.4               57.9   83.3   92.9   80.4   56.7   71.2   -
2026.04  50.38              57.14  78.21  -      -      -      -      78.85
2026.01  47.36              48.87  64.11  92.86  96.43  46.67  62.2   -
2026.04  47.36              48.87  64.11  -      -      -      -      84.62
2025.12  45.1               44.3   78.2   81.4   94.6   20     60.2   -
2026.01  40.15              46.21  70.12  81.43  41.07  60     53.51  -
2026.04  40.15              46.21  70.12  -      -      -      -      62.82
2026.01  39.85              48.48  67.95  90     98.21  53.33  60.9   -
2026.04  39.85              48.48  67.95  -      -      -      -      85.9
2026.01  32.33              31.06  48.72  80     64.29  30     44.66  -
2026.04  32.33              31.06  48.72  -      -      -      -      64.74
2025.12  31.6               21.1   60.3   38.6   80.4   20     39     -
2026.01  31.58              45.45  76.92  87.14  89.29  36.67  56.89  -
2026.04  31.58              45.45  76.92  -      -      -      -      78.21
2026.01  15.79              20.3   66.67  60     46.43  60     37.2   -
2026.04  15.79              20.3   66.67  -      -      -      -      55.13