Long-context understanding on LongBench (All Core Tasks + Averages)

[Chart: HotpotQA score over time, Dec 3, 2025 to Apr 13, 2026; highlighted entry: Full Cache, 57.15]
Updated 3d ago

Evaluation Results

Method    Links
2026.02
57.1544.5432.8772.7254.327.2831.2218.55789-25.0163.0746.237592.2148.97-47.41------
2026.02
56.5145.5433.2253.7854.5325.9633.1131.186.539844.1425.1748.6343.697092.0847.6350.37-------
2026.02
55.445.6234.7755.1355.9726.929.4130.05109944.6725.1447.7943.247391.1647.9550.48-------
2026.02
55.455.1334.7759.855.9726.929.4131.051099-25.1453.9243.24-91.1650.3-44.67------
2026.02
54.9750.7435.7760.9454.327.5729.7319.03100-24.949.733.76-87.7449-46.04------
2026.02
54.7945.9834.9557.9652.3127.5825.3632.216.0299.43-24.2853.831.43-69.2846.8-46.2------
2025.12
54.7836.6533.3733.0853.2526.0225.3925594.540.1723.5937.4636.82-83.38----0.64-----
2026.02
54.7248.4835.4160.854.5427.3927.7931.587.98100-25.0448.835.7-86.9648.8-46.7------
2026.02
54.245.1634.1662.8550.127.0426.5633.498.63100-24.1351.7432.36-77.5147.4-45.36------
2026.02
54.0548.6335.2858.1254.6927.0224.3532.787.85100-25.0341.2222.11-86.1546.7-45.04------
2025.12
53.3537.0133.3533.3554.4526.0225.926.17596.540.1824.0834.0838.77-85.5---0-----
2026.02
52.6148.0134.9259.8153.6427.1726.1430.888.67100-24.5744.7514.87-81.8945.9-44.27------
2026.04
50.52---48.93--15.44--18.52----81.8343.05--------
2026.04
49.45---53.29--19.04--17.75----84.7944.86--------
2026.04
49.35---49.49--19.06--18.25----85.7244.37--------
2026.04
48.98---50.11--19.43--15.77----80.2542.91--------
2025.12
48.735.9333.4652.6152.7726.4125.1224.5638340.1723.3557.1626.37-82.39---------
2025.12
48.6737.2432.828.654.1226.7726.6322.963.587.539.3922.9837.2837.99-83.29----4.82-----
2026.04
48.45---48.94--18.89--15.78----82.3242.88--------
2026.04
48.39---47.37--15.83--14.84----73.2839.94--------
2026.04
48.27---45--17.19--15.58----83.7941.97--------
2026.04
48.25---48.97--15.6--16.15----83.4442.48--------
2026.04
47.84---52.16--14.08--13.85----80.7841.74--------
2026.04
47.84---48.43--18.44--14.7----85.9343.07--------
2026.04
47.77---49.9--17.37--14.12----83.2942.49--------
2026.04
45.68---48.33--15.65--13.17----83.1641.2--------
2026.04
44.53---47.81--19.42--17.11----81.8242.14--------
2026.04
44.19---50.14--18.39--13.3----84.9742.2--------
2026.02
43.9943.7132.1248.6447.3124.3925.2623.825.56946.7422.9547.6633.595583.9740.8543.21-------
2026.04
43.84---49.44--15.33--13.26----73.6339.1--------
2026.04
43.36---46.74--18.69--15.63----80.3240.95--------
2025.12
43.0633.3733.3551.8649.8226.5719.8218.212.9793.541.0719.5158.0223.15-86.38---------
2026.02
42.5542.1431.5948.2746.9224.9623.7324.695.56545.5923.2855.2646.567689.5143.2245.74-------
2026.04
42.09---36.24--12.94--9.04----63.0832.68--------
2026.04
39.88---45--7.37--17.35----78.8437.69--------
2026.04
38.97---39.55--13.7--7.95----64.6432.96--------
2026.02
35.1221.4232.8615.2534.8325.7818.9614.9512100-22.5212.0643.83-88.4936.9-26.48------
2026.02
34.8822.4331.9816.5534.6225.2119.0615.399100-22.611.5543.99-88.8237-26.31------
2026.04
34.58---44.04--8.99--16.57----79.3336.7--------
2026.02
33.4820.6831.8315.234.1525.3417.9713.617100-22.538.7642.97-88.2936.3-25.87------
2026.02
33.2322.4332.1113.9833.9824.7919.1313.167100-22.878.7642.69-89.1636.4-26.13------
2026.02
33.1823.0433.312.0435.7726.1817.3914.25798.67-23.411.5441.48-82.6536.1-28------
2026.02
32.9622.132.9711.6136.4925.6916.3813.51698.17-22.1810.9239.38-73.3534.1-24.95------
2026.04
32.95---42.46--8.5--14.04----75.2334.64--------
2026.04
32.59---42.3--11--19.28----80.9237.22--------
2026.04
32.57---43.2--8.73--18.21----78.3636.21--------
2026.04
32.5---43.08--9.8--19.36----78.4336.63--------
2026.02
32.4220.532.314.8233.8924.916.6913.998100-22.6610.0943.29-86.9636.1-25.13------
2026.02
31.0129.5428.3429.3540.9724.6114.1117.9835.319.6721.5636.0238.7961.579.8130.131.9-------
2026.02
30.2629.3429.4930.2941.2925.9614.6118.593.14521.0521.4625.4439.266278.8929.7531.53-------
2026.04
29.17---38.21--7.46--13.64----69.2131.54--------
2026.04
27.31---41.4--6.44--17.29----70.432.57--------
2026.03
15.2-17.667.631.23.9---15.912.1-64.4-72.589.630.72.2--19.220.817.78.233.9
2026.03
12-29.77232.11---26.825.2-69.2-73.59133.50.8--1320.4148.747.2
2026.03
11.8-2671.829.90.9---29.723.6-69.2-73.590.533.10.8--13.919.813.88.145.9
2026.03
11.5-23.771.731.20.9---27.824.1-68.5-7091.331.70.9--12.719.513.98.946.5
2026.02
11.4911.6330.5138.4131.0826.767.0510.113.3956-24.1734.1341.7972.591.6832.14-23.49------
2026.02
11.4411.5926.2932.1321.2325.858.252.512.285.67-23.934.7238.6769.588.1626.34-19.2------
2026.02
9.9510.2120.8932.4216.622.734.232.682.065-21.2136.436.166079.9823.73-19.2------
2026.03
9.9-24.256.227.719.4---4.720-54.3-5886.428.10.3--12.216.711.66.641.4
2026.03
9.8-17.47033.116---7.414.2-65.6-69.588.329.31.2--7.422.1125.230.3
2026.03
9.7-930.416.112.8---4.916.6-36.9-6078.721.22.8--4.69.97.45.233.7
2026.03
9.7-16.564.723.110.7---14.321.5-59.1-72.584.628.61.3--6.416.612.36.637.1
2026.02
9.5911.4927.4372.0320.392.126.12529.17-23.8268.5344.777590.7830.44-18.87------
2026.03
9.5-30.17035.528.3---723.8-66.1-69.587.232.30--10.321.612.86.838.3
2026.03
9.5-2970.636.727.5---723.1-64.4-69.587.232.20--10.720.614.56.638.5
2026.03
9.4-23.169.834.127.1---724.1-65.1-6587.231.40--10.521.612.67.238.4
2026.02
9.1211.6226.7555.4729.1317.955.2219.960.555.53-22.3552.4741.3271.587.9529.73-18.84------
2026.02
9.1110.4215.9820.3128.9310.895.5113.963.083.82-8.9325.1619.8965.583.921.36-16.37------
2026.02
9.089.7613.4947.1317.212.315.277.831.893.45-746.6926.037283.7223.09-16.5------
2026.02
8.979.5825.7333.5229.3910.814.418.1813.33-21.1341.740.3862.588.0326.05-18.1------
2026.02
7.959.4222.9634.824.8510.293.9819.80.874.16-21.0446.737.985184.9324.83-16.48------
2026.02
7.119.8518.8337.7819.2610.734.2915.482.643.22-19.4543.2229.0140.574.421.94-15.34------
2026.03
5.4-8.735.48.811.1---3.58.6-36.6-31.549.114.93.3--0.810.75.22.717.7
2026.02
5.188.5316.2243.6715.8218.413.742.632.954.43-17.3842.3122.6430.535.5417.71-13.46------
2026.02
3.167.3312.9838.1911.2315.872.211.31.333.45-7.4837.2910.534819.7114.58-13.17------
2026.02
3.165.3111.6517.219.398.582.031.082.962.95-3.7519.993.0341.512.179.43-6.09------
2026.02
1.611.710.7217.244.27.151.390.793.330.65-3.5315.153.269.54.195.51-3.73------
2026.02
1.472.477.3736.495.699.791.020.511.543.69-7.8431.844.7694.818.4-6.08------
2026.02
1.332.027.1119.094.575.390.670.71.294.58-3.1417.591.2418.54.796.11-5.69------
2026.02
0.681.283.1824.071.925.040.950.710.943.24-8.5118.357.2100.925.09-4.41------
2026.02
0.240.656.417.752.161.650.440.112.164.33-2.7415.810.4812.333.8-2.52------