Question Answering on LongBench (48 items, median 11K tokens)
[Chart: F1 Score. Highlighted result: Full cache, 39.6, May 18, 2026. Metrics: F1 Score, Ceiling Rate, P-Value.]
Updated 14d ago
Evaluation Results

| Method | Configuration | Date | F1 Score | Ceiling Rate | P-Value |
|---|---|---|---|---|---|
| Full cache | Cache Capacity (C)=819... | 2026.05 | 39.6 | 100 | - |
| H2O + prot | Cache Capacity (C)=102... | 2026.05 | 34.5 | 87.2 | 0.445 |
| SnapKV + prot | Cache Capacity (C)=102... | 2026.05 | 34.5 | 87.2 | 0.445 |
| LRU + prot | Cache Capacity (C)=102... | 2026.05 | 33.6 | 85 | - |
| LRU + prot | Cache Capacity (C)=204... | 2026.05 | 33.4 | 84.4 | - |
| LRU + prot | Cache Capacity (C)=409... | 2026.05 | 31.4 | 79.4 | - |
| Random + prot | Cache Capacity (C)=409... | 2026.05 | 30 | 75.8 | 0.673 |
| LRU (no prot) | Cache Capacity (C)=102... | 2026.05 | 20.8 | 52.7 | 0.004 |
| LRU (no prot) | Cache Capacity (C)=204... | 2026.05 | 18.7 | 47.3 | 0.002 |
| LRU (no prot) | Cache Capacity (C)=409... | 2026.05 | 10.8 | 27.2 | 0.001 |
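
The page does not define Ceiling Rate. The reported figures are consistent with a method's F1 expressed as a percentage of the Full cache F1 (39.6), so the sketch below assumes that definition. The function name `ceiling_rate` and the constant `FULL_CACHE_F1` are hypothetical and exist only for this illustration. The F1 scores and reported rates are copied from the results above.

```python
# Assumption (not stated on the page): ceiling rate = 100 * F1 / full-cache F1.
FULL_CACHE_F1 = 39.6  # F1 of the "Full cache" row


def ceiling_rate(f1: float, ceiling_f1: float = FULL_CACHE_F1) -> float:
    """Return f1 as a percentage of the ceiling (full-cache) F1."""
    if ceiling_f1 <= 0:
        raise ValueError("ceiling_f1 must be positive")
    return 100.0 * f1 / ceiling_f1


# (method, F1 score, reported ceiling rate), copied from the results
rows = [
    ("H2O + prot", 34.5, 87.2),
    ("LRU + prot", 33.6, 85.0),
    ("Random + prot", 30.0, 75.8),
    ("LRU (no prot)", 20.8, 52.7),
    ("LRU (no prot)", 10.8, 27.2),
]

for method, f1, reported in rows:
    estimate = ceiling_rate(f1)
    print(f"{method:15s} F1={f1:5.1f} estimated={estimate:5.1f} reported={reported:5.1f}")
```

Under this assumption the estimates land within about 0.2 points of the reported values. The small gaps would be expected if the page computes the rate from unrounded F1 scores.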