
Long-context language understanding on RULER 128k

Accuracy: 89.49

RetroInfer

[Chart: accuracy axis ticks at 68.5652, 73.9976, 79.43, 84.8624; date label Nov 18, 2025]
Updated 19d ago

Evaluation Results

Date     Accuracy  Other columns
2025.11  89.49     -
2025.11  88.94     -
2025.11  88.85     -
2025.11  88.47     -
2025.11  85.26     -
2025.11  73.87     -
2025.11  73.44     -
2025.11  72.96     -
2025.11  72.73     -
2025.11  69.37     -
2025.12  -         49.118978.6794631629279082282112.8
2025.12  -         17.3225768311.50470021141.6
2025.12  -         42.3178.375.6791180.7528.598971820195.8
2025.12  -         4271.3648545.2540.25429087271610.2