Long-context evaluation on LongBench v2
[Chart: Overall Score over time. Top entry: CSAttention, 31.2 (Mar 30, 2026). Selectable metrics: Overall Score, Easy Score, Hard Score, Short Context Score, Medium Context Score, Long Context Score.]
Evaluation Results
| Method | Backbone | Date | Overall Score | Easy Score | Hard Score | Short Context Score | Medium Context Score | Long Context Score |
|---|---|---|---|---|---|---|---|---|
| CSAttention | Llama-3.1-8B | 2026.03 | 31.2 | 34.4 | 29.3 | 37.8 | 25.1 | 32.4 |
| Full | Llama-3.1-8B | 2026.03 | 31.0 | 35.4 | 28.3 | 37.2 | 26.0 | 30.6 |
| H2O | Llama-3.1-8B | 2026.03 | 29.9 | 32.9 | 28.0 | 33.8 | 27.9 | 31.5 |
| PQCache | Llama-3.1-8B | 2026.03 | 29.8 | 33.3 | 27.7 | 37.8 | 22.3 | 31.5 |
| MagicPig | Llama-3.1-8B | 2026.03 | 29.2 | 29.5 | 29.0 | 31.8 | 26.9 | 29.4 |
| SparQ | Llama-3.1-8B | 2026.03 | 26.2 | 27.6 | 25.4 | 30.0 | 22.3 | 27.8 |