
Long-context language understanding on LongBench (12 specific metrics subset)

[Chart: NrtvQA score by date. Highlighted entry: Palu, 30.54 on NrtvQA, Sep 25, 2025.]
Updated 1mo ago

Evaluation Results

Method    Links
2025.09
30.5432.9121.3818.0714.76714.07393.145.145.8627.26
2025.09
29.565353.7646.1328.387.57.476.2599.519.8819.2233.7
2025.09
24.2528.3518.7216.5310.86654.28320.492.813.2317.96
2025.09
18.7934.4125.328.338.5206.221.55915.0817.3514.96
2025.09
17.1723.639.7110.225.303.863.167.15.084.778.18
2025.09
16.9532.821.3124.736.2805.342.252.6114.0616.7813.01
2025.09
16.8633.7623.4925.917.4205.263.143.7514.6516.7913.73
2025.09
16.7733.6223.326.387.3305.2841.516.0517.4813.79
2025.09
16.2831.1920.423.77.4403.911.61416.1620.8513.23
2025.09
15.9131.4819.9720.347.6905.762.012.2816.0621.1512.97
2025.09
15.7521.328.510.114.3404.142.356.586.485.317.72
2025.09
14.2127.5417.4119.775.110.255.653.010.513.4118.8411.43
2025.09
12.3448.8949.2644.6917.4799.5938921.7122.9229.81
2025.09
12.1148.8148.9444.5818.87107.88487.521.222.3529.66
2025.09
11.8149.6848.0643.5615.43109.46389.521.3622.4629.48
2025.09
11.6347.9946.9843.7118.8747.88487.521.5922.9728.46
2025.09
9.7343.9435.9534.4915.213.57.783.53318.2121.9420.66
2025.09
8.1639.3825.6322.7312.650.677.11.6212.518.7623.4815.7