Long-context Language Modeling on Composite Suite (MRCR v2, GraphWalks, LongBench v2, RULER, AA-LCR)

78.7 Average Score

IndexCache

Mar 12, 2026
Updated 1mo ago

Evaluation Results

Date      Average   MRCR v2   GraphWalks   LongBench v2   RULER   AA-LCR
2026.03   78.7      72.3      90.8         66             97.3    67.2
2026.03   78.4      71.1      92.7         64.5           97.7    66.2
2026.03   78.1      72.8      90.2         65.1           97.6    64.6
2026.03   78        70.8      90.3         63.7           97.6    67.6
2026.03   72.7      65.8      74.9         62.2           96.2    64.6
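
The first score in each result row appears to be the unweighted mean of the five benchmark scores that follow it, rounded to one decimal place. The page does not state this, so treat it as an assumption. A minimal Python sketch that checks the assumption against the rows above (the function names `composite_average` and `rows_consistent` are illustrative, not part of any leaderboard API):

```python
# Each entry: (reported average, five per-benchmark scores from one result row).
# Assumption: the reported average is an unweighted mean, rounded to 1 decimal.
ROWS = [
    (78.7, [72.3, 90.8, 66.0, 97.3, 67.2]),
    (78.4, [71.1, 92.7, 64.5, 97.7, 66.2]),
    (78.1, [72.8, 90.2, 65.1, 97.6, 64.6]),
    (78.0, [70.8, 90.3, 63.7, 97.6, 67.6]),
    (72.7, [65.8, 74.9, 62.2, 96.2, 64.6]),
]


def composite_average(scores):
    """Unweighted mean of the per-benchmark scores, rounded to one decimal."""
    return round(sum(scores) / len(scores), 1)


def rows_consistent(rows, tol=1e-6):
    """Return True if every reported average matches the recomputed mean."""
    return all(abs(composite_average(scores) - reported) < tol
               for reported, scores in rows)


if __name__ == "__main__":
    for reported, scores in ROWS:
        print(reported, composite_average(scores))
    print("all rows consistent:", rows_consistent(ROWS))
```

All five rows agree with the recomputed mean, which supports reading each row as one average followed by five per-benchmark scores.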