Long-context language understanding on RULER (4k Context)
[Chart: Single-Key Score by date. Plotted point: Full Attention, 100 (Feb 4, 2026). Selectable metrics: Single-Key Score, Multi-Key Score, Multi-Value Score, Multi-Query Score, VT Score, FWE Rate, QA Score 1, QA Score 2, Average Score.]
Updated 4d ago
Evaluation Results
| Method | Single-Key Score | Multi-Key Score | Multi-Value Score | Multi-Query Score | VT Score | FWE Rate | QA Score 1 | QA Score 2 | Average Score |
|---|---|---|---|---|---|---|---|---|---|
| Full Attention (Attention mechanism=Fu..., 2026.02) | 100 | 89.6 | 87.8 | 79.2 | 17.4 | 10 | 79.8 | 56.4 | 63.7 |
| LycheeDecode (Attention mechanism=Sp..., 2026.02) | 100 | 89.4 | 88.4 | 78.9 | 17.3 | 10 | 80 | 56.2 | 63.7 |