Long-context understanding on LongBench (Comprehensive Metric Set)

Average Score: 48.37

Llama3.1-8B

[Chart: score over time, Feb 6, 2025 to Sep 28, 2025]
Updated 4d ago

Evaluation Results

Method  Links
Date     Average  Metric scores
2025.09  48.37    40.88  51.72  58.47  52.68  35.8   42.52  30.64  24.44  47.1   92.83  73     0.5    39     68.56  67.48
2025.02  48.17    -      -      -      -      -      -      -      -      -      -      -      -      -      -      -
2025.02  48       -      -      -      -      -      -      -      -      -      -      -      -      -      -      -
2025.09  47.91    33.76  51.4   57.17  56.76  29.4   38.09  31.57  24.4   46.73  92.5   72     2      66.5   62.86  53.53
2025.02  47.79    -      -      -      -      -      -      -      -      -      -      -      -      -      -      -
2025.09  47.67    31.09  50.52  56.95  55.61  29.41  38.27  31.25  23.16  46.53  92.5   72.5   3      68     62.72  53.5
2025.09  47.62    39.14  49.93  53.45  52.29  34.4   42.98  31.05  23.89  47.2   93.08  72.5   0.5    38.5   68.31  66.99
2025.09  47.32    40.17  51.16  55.73  52.83  35.52  42.78  28.21  24.31  45.6   92.88  72     0.5    32.66  68.64  66.86
2025.09  47.21    31.68  50.56  56.26  56.11  28.19  37.64  28.95  24.51  46.44  92.5   72     1.5    66     62.36  53.47
2025.09  46.8     35.94  50.96  54.13  50.54  30.95  41.83  27.5   23.06  46.23  92.83  73     0.5    39     68.43  67
2025.02  46.23    -      -      -      -      -      -      -      -      -      -      -      -      -      -      -
2025.09  46.16    28.88  50.37  50.47  51.6   24.55  37.76  30.36  22.27  46.84  91.5   72     2.5    67.5   62.68  53.07
2025.02  46.1     -      -      -      -      -      -      -      -      -      -      -      -      -      -      -
2025.09  45.95    36.23  51.16  51.81  47.7   27.83  40.85  27.12  21.84  46.89  92.83  72     0.5    37.5   68.5   66.54
2025.09  45.14    26.43  50.18  50.29  47.25  22.83  36.34  28.31  21.14  46.71  91.5   71.5   1.5    65.5   62.88  52.84
2025.02  44.2     -      -      -      -      -      -      -      -      -      -      -      -      -      -      -
2025.02  43.99    -      -      -      -      -      -      -      -      -      -      -      -      -      -      -
2025.02  43.77    -      -      -      -      -      -      -      -      -      -      -      -      -      -      -
2025.02  43.74    -      -      -      -      -      -      -      -      -      -      -      -      -      -      -
2025.02  43.46    -      -      -      -      -      -      -      -      -      -      -      -      -      -      -
2025.02  43.3     -      -      -      -      -      -      -      -      -      -      -      -      -      -      -
2025.02  42.85    -      -      -      -      -      -      -      -      -      -      -      -      -      -      -
2025.02  40.22    -      -      -      -      -      -      -      -      -      -      -      -      -      -      -
2025.02  39.77    -      -      -      -      -      -      -      -      -      -      -      -      -      -      -
2025.02  39.52    -      -      -      -      -      -      -      -      -      -      -      -      -      -      -
2025.02  31.13    -      -      -      -      -      -      -      -      -      -      -      -      -      -      -
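
The page does not say how the Average Score is derived from the individual metric scores. A minimal sketch, assuming it is the unweighted mean of the fifteen metric scores in a row, reproduces the headline 48.37 from the top 2025.09 entry. The variable names are illustrative and not part of the leaderboard.

```python
from statistics import mean

# Metric scores of the top 2025.09 entry, in the order they appear in the row.
top_row_scores = [
    40.88, 51.72, 58.47, 52.68, 35.8, 42.52, 30.64, 24.44,
    47.1, 92.83, 73, 0.5, 39, 68.56, 67.48,
]

# Assumption: the leaderboard average is a plain unweighted mean,
# rounded to two decimals for display.
average_score = round(mean(top_row_scores), 2)
print(average_score)  # 48.37
```

Because the published metric scores are themselves rounded, this kind of check is only approximate and need not match the listed average exactly for every row.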