Long-context retrieval and reasoning on RULER 11 tasks average

Context Length 4K Performance: 99.34 (Full)

Updated 2mo ago

Evaluation Results

Method	Links	4K	8K	16K	32K	64K	128K	Avg.
Full 2026.05		99.34	98.83	98.55	94.89	89.85	79.32	93.46
ReST-KV 2026.05		94.01	86.66	84.12	81.87	78.65	68.28	82.27
SnapKV 2026.05		83.6	75.54	71.12	66.95	57.47	47.99	67.11
PyramidKV 2026.05		81.35	73.66	70.23	69.83	57.84	48.93	66.97
Streaming 2026.05		39.81	18.42	12.1	10.57	9.91	8.18	16.5
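
The final number in each row appears to be the arithmetic mean of the six scores before it. The short sketch below checks that reading against the values in the table; the names `scores`, `reported_avg`, and `mean` are illustrative only, and the tolerance allows for rounding to two decimals.

```python
# Scores copied from the table above: the six per-column values for each
# method, with the final (average) column kept separately.
scores = {
    "Full": [99.34, 98.83, 98.55, 94.89, 89.85, 79.32],
    "ReST-KV": [94.01, 86.66, 84.12, 81.87, 78.65, 68.28],
    "SnapKV": [83.6, 75.54, 71.12, 66.95, 57.47, 47.99],
    "PyramidKV": [81.35, 73.66, 70.23, 69.83, 57.84, 48.93],
    "Streaming": [39.81, 18.42, 12.1, 10.57, 9.91, 8.18],
}

# Final column of each row as reported in the table.
reported_avg = {
    "Full": 93.46,
    "ReST-KV": 82.27,
    "SnapKV": 67.11,
    "PyramidKV": 66.97,
    "Streaming": 16.5,
}


def mean(values):
    """Plain arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# Assumption: the reported figure is the unweighted mean, rounded to two
# decimals, so a tolerance slightly above 0.005 covers the rounding.
TOLERANCE = 0.006

for method, values in scores.items():
    computed = mean(values)
    matches = abs(computed - reported_avg[method]) < TOLERANCE
    print(f"{method}: computed {computed:.3f}, reported {reported_avg[method]}, match={matches}")
```

If the check holds for every row, the last column can be read as an unweighted average across the six evaluated settings rather than a separately measured score.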