
Long-context Reasoning on LongBench v2

68.2 Average Score

Gemini 3 Pro

[Chart: score by date, Jan 24, 2026 to Feb 17, 2026]
Updated 4d ago

Evaluation Results

Method Links
68.2 - - - - - - -
2026.02
64.5 - - - - - - -
64.4 - - - - - - -
2026.02
61 - - - - - - -
59.8 - - - - - - -
59.8 - - - - - - -
2026.02
59.1 - - - - - - -
2026.02
36.23 38.75 32.67 39.12 41.15 33.2 - -
2026.02
35.44 38.75 37.21 26.39 35.16 35.61 - -
2026.01
33.41 - - - 37.33 31.2 0.66 -
2026.02
33.38 35.65 28.76 34.65 40.62 28.93 - -
2026.02
32.84 37.93 28.57 32.21 39.9 28.49 - -
2026.01
32.69 - - - 32 33.08 - -
2026.02
32.41 36.67 28.37 33.33 - - - -
2026.01
31.97 - - - 39.33 27.82 - -
2026.01
31.73 - - - 38.67 27.82 0.7 -
2026.02
31.46 35.16 28.1 31.46 34.7 29.46 - -
2026.02
31.01 41.11 26.05 24.07 - - - -
2026.01
31.01 - - - 40.67 25.56 0.7 -
2026.01
30.77 - - - 30.67 30.83 0.75 -
2026.01
30.74 - - - 30.68 30.77 0.78 -
2026.02
30.41 33.33 31.36 23.87 28.32 31.63 - -
2026.02
29.22 32.22 28.84 25 - - - -
2026.01
29.09 - - - 29.32 28.67 - -
2026.01
29.09 - - - 29.32 28.67 - -
2026.01
29.09 - - - 28 29.7 0.68 -
2026.01
28.61 - - - 28 28.92 0.7 -
2026.01
28.12 - - - 32 25.94 0.72 -
2026.01
27.88 - - - 34 24.44 0.7 -
2026.01
27.88 - - - 28.67 27.44 0.7 -
2026.02
27.43 31.11 25.58 25 - - - -
2026.02
27.24 27.78 23.26 34.26 - - - -
2026.02
27.04 27.22 23.26 34.26 - - - -
2026.02
27.04 29.44 21.4 34.26 - - - -
2026.02
26.64 32.02 24.45 23.35 28.25 25.65 - -
2026.02
26.44 26.67 29.77 19.44 - - - -
2026.02
26.24 32.22 24.65 19.44 - - - -
2026.01
25.96 - - - 32.67 22.18 - -
2026.01
25.96 - - - 28.67 24.44 0.7 -
2026.02
25.65 32.93 22.53 19.64 21.54 28.1 - -
2026.01
25.48 - - - 28 24.06 - -
2026.01
24.35 - - - 27.6 22.33 0.7 -
2026.02
22.86 17.78 26.05 25 - - - -
2026.02
20.87 25 16.28 23.15 - - - -
2026.02
20.68 26.11 13.95 25 - - - -
2026.02
20.58 27.93 17.86 14.08 19.67 21.14 - -
2026.02
17.13 20.71 13.69 17.78 19.59 15.62 - -
2026.02
13.12 13.89 14.88 10.56 - - - -
2026.01
- - - - - - - 39.4
2026.01
- - - - - - - 30
2026.01
- - - - - - - 40.2
2026.01
- - - - - - - 40.9
2026.01
- - - - - - - 40.7
2026.01
- - - - - - - 38.9
2026.01
- - - - - - - 39.4
2026.01
- - - - - - - 36.6
2026.01
- - - - - - - 38.8
2026.01
- - - - - - - 39.4
2026.01
- - - - - - - 37.4