Long-context language understanding on LongBench Llama-3.2-1B-Instruct (test)

16.12 NQA

[Chart: NQA; legend: Baseline; y-axis ticks: 10.3896, 11.8773, 13.365, 14.8527; x-axis: Feb 6, 2026]
Updated 4d ago

Evaluation Results

Method  Links
2026.02
16.1226.5442.4129.3131.2714.8330.1421.6225.7832.5170.197.382428.2527.1
2026.02
15.6726.5639.6431.5330.4710.8328.4621.4525.334172.496.934431.227.5
2026.02
15.1924.8637.9627.8627.7911.1928.2921.5225.0539.1569.547.553431.1926.5
2026.02
14.5227.6138.2228.832.878.828.0121.5624.7839.3968.6910.064529.7327
2026.02
14.3225.9337.8427.7331.488.5526.1620.9624.541.6969.4612.933529.4326.8
2026.02
14.1228.1639.4431.5729.28.5227.1820.6224.1941.5972.528.554430.827.1
2026.02
10.6123.1835.1329.4723.2910.8826.9520.3824.3241.4662.8210.52432.3325.3