
Long-context language modeling on RULER

RULER Score: 0.911

GA-S2

[Chart: RULER Score; date shown: Dec 23, 2025]
Updated 4d ago

Evaluation Results

Date     RULER Score
2025.12  0.911
2025.12  0.9061
2025.12  0.9051
2025.12  0.8949
2025.12  0.8869
2025.12  0.8733
2025.12  0.8717
2025.12  0.8704
2025.12  0.8631
2025.12  0.8619
2025.12  0.8584
2025.12  0.8541
2025.12  0.8533
2025.12  0.8444
2025.12  0.8423
2025.12  0.8394
2025.12  0.8347
2025.12  0.8259
2025.12  0.8257
2025.12  0.8231
2025.12  0.8226
2025.12  0.8209
2025.12  0.8205
2025.12  0.819
2025.12  0.8158
2025.12  0.8126
2025.12  0.8118
2025.12  0.8111
2025.12  0.8031
2025.12  0.7983
2025.12  0.7873
2025.12  0.7728
2025.12  0.7662
2025.12  0.7552
2025.12  0.7539
2025.12  0.7539
2025.12  0.7538
2025.12  0.7538
2025.12  0.7516
2025.12  0.7453
2025.12  0.7409
2025.12  0.7357
2025.12  0.7322
2025.12  0.7318
2025.12  0.7262
2025.12  0.7227
2025.12  0.7175
2025.12  0.716
2025.12  0.7152
2025.12  0.7075
2025.12  0.7065
2025.12  0.6997
2025.12  0.6991
2025.12  0.6953
2025.12  0.6907
2025.12  0.6904
2025.12  0.6786
2025.12  0.6752
2025.12  0.6749
2025.12  0.6743
2025.12  0.6707
2025.12  0.6657
2025.12  0.6626
2025.12  0.6617
2025.12  0.659
2025.12  0.6569
2025.12  0.6544
2025.12  0.6479
2025.12  0.6469
2025.12  0.6464
2025.12  0.6441
2025.12  0.6435
2025.12  0.6401
2025.12  0.639
2025.12  0.6385
2025.12  0.6329
2025.12  0.6274
2025.12  0.6051
2025.12  0.6023
2025.12  0.5965
2025.12  0.59
2025.12  0.5804
2025.12  0.5731
2025.12  0.5717
2025.12  0.5651
2025.12  0.5595
2025.12  0.558
2025.12  0.5557
2025.12  0.5552
2025.12  0.5512
2025.12  0.5417
2025.12  0.5408
2025.12  0.5408
2025.12  0.5359
2025.12  0.5355
2025.12  0.5229
2025.12  0.5214
2025.12  0.5138
2025.12  0.51
2025.12  0.5098
Showing 100 of 148 rows