Long-context language understanding on RULER 32k context length

87.5 Average Score

Quest

[Chart: scores over time, Jan 20, 2026 to Feb 6, 2026]
Updated 4d ago

Evaluation Results

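| Method (date) | Average Score | | | | | | | | | | | Links |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2026.02 | 87.5 | - | - | - | - | 97.6 | 89.7 | 85 | 54 | 100 | 99 | |
| 2026.02 | 86.9 | - | - | - | - | 98.2 | 92.7 | 81 | 51 | 100 | 99 | |
| 2026.02 | 86.8 | - | - | - | - | 95.2 | 91.7 | 84 | 53 | 97 | 100 | |
| 2026.02 | 86.8 | - | - | - | - | 95 | 88.3 | 86 | 53 | 100 | 99 | |
| 2026.02 | 86.2 | - | - | - | - | 96.4 | 92.3 | 81 | 51 | 98 | 99 | |
| 2026.02 | 85.6 | - | - | - | - | 94.2 | 93.7 | 83 | 49 | 97 | 97 | |
| 2026.02 | 85.5 | - | - | - | - | 94.2 | 88.7 | 82 | 53 | 95 | 100 | |
| 2026.02 | 85.3 | - | - | - | - | 97.8 | 93 | 81 | 51 | 91 | 98 | |
| 2026.02 | 83.8 | - | - | - | - | 91.4 | 87.7 | 84 | 54 | 90 | 96 | |
| 2026.02 | 83.4 | - | - | - | - | 95.2 | 92.3 | 74 | 51 | 91 | 97 | |
| 2026.02 | 83.1 | - | - | - | - | 94.2 | 87.7 | 82 | 50 | 93 | 92 | |
| 2026.02 | 82.7 | - | - | - | - | 91.4 | 86 | 82 | 52 | 93 | 92 | |
| 2026.02 | 80.4 | - | - | - | - | 88.6 | 92.3 | 79 | 50 | 87 | 86 | |
| 2026.02 | 72 | - | - | - | - | 90.2 | 92.3 | 68 | 51 | 49 | 82 | |
| 2026.02 | 71.9 | - | - | - | - | 83.6 | 64 | 77 | 50 | 83 | 74 | |
| 2026.02 | 69.5 | - | - | - | - | 86 | 83.3 | 73 | 49 | 64 | 62 | |
| 2026.02 | 69.1 | - | - | - | - | 81 | 89.7 | 75 | 51 | 73 | 45 | |
| 2026.02 | 65.9 | 99.2 | 97.4 | 96.5 | 81.9 | 19.8 | 0 | 70.6 | 51.6 | - | - | |
| 2026.02 | 64.1 | - | - | - | - | 76 | 80 | 71 | 54 | 74 | 30 | |
| 2026.02 | 59.4 | 97.8 | 82 | 94.9 | 65.1 | 19.8 | 0 | 66.2 | 49.6 | - | - | |
| 2026.02 | 49.4 | - | - | - | - | 72.2 | 83.7 | 63 | 46 | 32 | 0 | |
| 2026.02 | 44.6 | - | - | - | - | 66.4 | 88.3 | 48 | 44 | 20 | 1 | |
| 2026.02 | 44.1 | - | - | - | - | 82.8 | 91.7 | 38 | 42 | 10 | 0 | |
| 2026.02 | 30.2 | - | - | - | - | 32.2 | 83 | 35 | 29 | 2 | 0 | |
| 2026.02 | 24.8 | - | - | - | - | 0 | 83 | 35 | 29 | 2 | 0 | |
| 2026.01 | 24.1 | - | - | - | - | - | - | - | - | - | - | |
| 2026.02 | 16.2 | - | - | - | - | 0 | 61.7 | 11 | 24 | 1 | 0 | |
| 2026.01 | 11.7 | - | - | - | - | - | - | - | - | - | - | |
| 2026.01 | 9.3 | - | - | - | - | - | - | - | - | - | - | |
| 2026.01 | 9.2 | - | - | - | - | - | - | - | - | - | - | |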