Long-context understanding on LongBench-e

57.33 Accuracy

Vanilla

[Chart: Accuracy, Apr 20, 2026]
Updated 1mo ago

Evaluation Results

Method   Links

Date     Accuracy
2026.04  57.33  -      -      -      -      -      -      -      -      -      -      -      -      -      -      -      -      -      -      -
2026.04  54.36  -      -      -      -      -      -      -      -      -      -      -      -      -      -      -      -      -      -      -
2026.04  52.23  -      -      -      -      -      -      -      -      -      -      -      -      -      -      -      -      -      -      -
2026.04  51.23  -      -      -      -      -      -      -      -      -      -      -      -      -      -      -      -      -      -      -
2026.04  49.23  -      -      -      -      -      -      -      -      -      -      -      -      -      -      -      -      -      -      -
2026.04  48.87  -      -      -      -      -      -      -      -      -      -      -      -      -      -      -      -      -      -      -
2026.04  47.12  -      -      -      -      -      -      -      -      -      -      -      -      -      -      -      -      -      -      -
2026.04  45.54  -      -      -      -      -      -      -      -      -      -      -      -      -      -      -      -      -      -      -
2026.04  44.94  -      -      -      -      -      -      -      -      -      -      -      -      -      -      -      -      -      -      -
2026.04  44.74  -      -      -      -      -      -      -      -      -      -      -      -      -      -      -      -      -      -      -
2026.04  44     -      -      -      -      -      -      -      -      -      -      -      -      -      -      -      -      -      -      -
2026.04  43.12  -      -      -      -      -      -      -      -      -      -      -      -      -      -      -      -      -      -      -
2025.06  -      54.04  99.83  69.27  86.26  36.03  43.6   75.67  52.28  60.69  30.14  22     21.91  44.08  53.52  -      -      -      -      -
2025.06  -      53.9   100    68.58  87.55  36.22  42.75  75.67  52.29  60.51  30.17  22     21.79  43.7   53.47  -      -      -      -      -
2026.04  -      59.86  99.33  -      85.75  54.52  -      65     -      51.07  6.97   -      -      44.19  48.87  51.13  62.97  6.57   32.95  15
2026.04  -      52.71  83.67  -      84.79  52.31  -      59.33  -      43.48  6.96   -      -      40.44  43.99  42.63  57.67  6.53   33.98  7.33
2026.04  -      49.64  99     -      80.87  53.05  -      61.67  -      43.29  6.44   -      -      40.41  44.16  42.91  51.58  6.41   32.52  6.33
2026.04  -      45.54  99.33  -      85.01  40.34  -      64.67  -      50.16  6.19   -      -      40.18  45     44.49  60.95  6.01   27.14  15
2026.04  -      51.89  97.67  -      85.25  49.37  -      63.67  -      48.31  7      -      -      38.14  46.15  42.98  61.54  6.7    30.87  16.6
2026.04  -      54.58  98     -      84.21  51.14  -      59     -      48.03  7.01   -      -      40.58  46.76  49.38  61     6.47   31.82  16.67