
Classification on Zero-shot Evaluation Suite (BoolQ, PIQA, HellaSwag, WinoGrande, ARC-e, ARC-c, OBQA)

Average Accuracy (Zero-Shot Suite): 69.99

Dense

[Chart: Average Accuracy (Zero-Shot Suite) over time, May 28, 2023 to Mar 6, 2026]
Updated 1mo ago

Evaluation Results

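| Date | Average | BoolQ | PIQA | HellaSwag | WinoGrande | ARC-e | ARC-c | OBQA |
|---|---|---|---|---|---|---|---|---|
| 2026.03 | 69.99 | 81.38 | 80.79 | 79.16 | 72.61 | 77.74 | 53.24 | 45 |
| 2026.03 | 69.57 | 82.23 | 80.58 | 78.56 | 74.51 | 76.43 | 50.51 | 44.2 |
| 2026.03 | 69.43 | 82.14 | 80.09 | 78.54 | 74.03 | 76.6 | 50.43 | 44.2 |
| 2026.02 | 69.31 | - | - | - | - | - | - | - |
| 2024.03 | 68.59 | 76.5 | 79.8 | 76.1 | 70.1 | 72.8 | 47.6 | 57.2 |
| 2026.03 | 68.53 | 82.14 | 79 | 76.62 | 72.61 | 74.37 | 50.34 | 44.6 |
| 2026.03 | 68.51 | 82.42 | 79.43 | 76.71 | 73.24 | 74.87 | 50.09 | 42.8 |
| 2026.02 | 66.85 | - | - | - | - | - | - | - |
| 2026.02 | 66.79 | - | - | - | - | - | - | - |
| 2026.03 | 65.58 | 78.87 | 77.2 | 73.19 | 72.45 | 70.92 | 45.65 | 40.8 |
| 2026.03 | 65.18 | 78.41 | 77.64 | 72.92 | 72.85 | 68.73 | 44.11 | 41.6 |
| 2026.02 | 64.39 | - | - | - | - | - | - | - |
| 2026.02 | 64.34 | - | - | - | - | - | - | - |
| 2026.02 | 64.14 | - | - | - | - | - | - | - |
| 2026.02 | 63.52 | - | - | - | - | - | - | - |
| 2023.05 | 63.25 | 73.18 | 78.35 | 72.99 | 67.01 | 67.45 | 41.38 | 42.4 |
| 2024.03 | 63.25 | 73.18 | 78.35 | 72.99 | 67.01 | 67.45 | 41.38 | 42.4 |
| 2026.02 | 63.21 | - | - | - | - | - | - | - |
| 2026.02 | 63.21 | - | - | - | - | - | - | - |
| 2026.02 | 62.79 | - | - | - | - | - | - | - |
| 2026.02 | 62.54 | - | - | - | - | - | - | - |
| 2026.02 | 62.54 | - | - | - | - | - | - | - |
| 2024.03 | 62.44 | 68.53 | 77.8 | 70.58 | 67.49 | 70.24 | 40.44 | 42 |
| 2026.02 | 62.32 | - | - | - | - | - | - | - |
| 2024.03 | 62.08 | 69.63 | 76.82 | 71.2 | 68.35 | 69.91 | 39.25 | 39.4 |
| 2026.02 | 60.87 | - | - | - | - | - | - | - |
| 2026.02 | 60.6 | - | - | - | - | - | - | - |
| 2023.05 | 60.05 | 65.62 | 79.31 | 70 | 62.76 | 65.87 | 37.69 | 39.14 |
| 2024.03 | 60.05 | 65.62 | 79.31 | 70 | 62.76 | 65.87 | 37.69 | 39.14 |
| 2026.02 | 59.82 | - | - | - | - | - | - | - |
| 2026.02 | 59.62 | - | - | - | - | - | - | - |
| 2024.03 | 59.51 | 79.08 | 75.46 | 53.44 | 67.8 | 68.64 | 37.97 | 34.2 |
| 2023.05 | 59.46 | 65.37 | 76.65 | 69.41 | 63.78 | 65.45 | 36.12 | 39.5 |
| 2026.02 | 59.33 | - | - | - | - | - | - | - |
| 2023.05 | 59.23 | 64.62 | 77.2 | 68.8 | 63.14 | 64.31 | 36.77 | 39.8 |
| 2026.02 | 58.97 | - | - | - | - | - | - | - |
| 2026.03 | 58.61 | 75.78 | 72.52 | 61.79 | 68.43 | 59.72 | 34.64 | 37.4 |
| 2026.02 | 58.59 | - | - | - | - | - | - | - |
| 2026.02 | 58.31 | - | - | - | - | - | - | - |
| 2026.02 | 58.06 | - | - | - | - | - | - | - |
| 2026.02 | 57.8 | - | - | - | - | - | - | - |
| 2026.02 | 57.79 | - | - | - | - | - | - | - |
| 2024.03 | 57.6 | 65 | 75 | 65.7 | 57.9 | 67.06 | 36.6 | 35.8 |
| 2026.03 | 57.52 | 76.39 | 72.2 | 61.61 | 66.69 | 57.58 | 33.19 | 35 |
| 2024.03 | 57.34 | 71.13 | 75.24 | 51.58 | 67.56 | 68.98 | 36.09 | 30.8 |
| 2023.05 | 57.23 | 65.75 | 74.7 | 64.52 | 59.35 | 60.65 | 36.26 | 39.4 |
| 2024.03 | 57.23 | 65.75 | 74.7 | 64.52 | 59.35 | 60.65 | 36.26 | 39.4 |
| 2026.02 | 56.08 | - | - | - | - | - | - | - |
| 2026.02 | 55.96 | - | - | - | - | - | - | - |
| 2026.02 | 55.88 | - | - | - | - | - | - | - |
| 2026.02 | 54.88 | - | - | - | - | - | - | - |
| 2026.02 | 54.26 | - | - | - | - | - | - | - |
| 2026.02 | 54.19 | - | - | - | - | - | - | - |
| 2023.05 | 53.59 | 61.89 | 70.81 | 58.34 | 56.87 | 54.87 | 34.02 | 38.4 |
| 2024.03 | 53.59 | 61.89 | 70.81 | 58.34 | 56.87 | 54.87 | 34.02 | 38.4 |
| 2026.02 | 53.44 | - | - | - | - | - | - | - |
| 2026.02 | 52.98 | - | - | - | - | - | - | - |
| 2026.02 | 52.56 | - | - | - | - | - | - | - |
| 2024.03 | 51.8 | 60.55 | 72.36 | 55.25 | 55.09 | 50.84 | 31.48 | 37 |
| 2024.03 | 51.24 | 64.52 | 69.9 | 43.29 | 64.95 | 61.86 | 30.37 | 23.8 |
| 2026.02 | 50.39 | - | - | - | - | - | - | - |
| 2024.03 | 50.37 | 60.21 | 67.52 | 52.14 | 57.54 | 49.66 | 29.95 | 35.6 |
| 2026.02 | 50.11 | - | - | - | - | - | - | - |
| 2023.05 | 49.71 | 61.88 | 71.53 | 47.86 | 55.01 | 45.13 | 31.62 | 34.98 |
| 2024.03 | 49.71 | 61.88 | 71.53 | 47.86 | 55.01 | 45.13 | 31.62 | 34.98 |
| 2023.05 | 49.56 | 61.43 | 70.88 | 47.65 | 55.12 | 45.78 | 30.5 | 35.62 |
| 2024.03 | 49.56 | 61.43 | 70.88 | 47.65 | 55.12 | 45.78 | 30.5 | 35.62 |
| 2026.02 | 49.34 | - | - | - | - | - | - | - |
| 2024.03 | 48.98 | 61.47 | 68.82 | 47.56 | 55.09 | 46.46 | 28.24 | 35.2 |
| 2024.03 | 48.98 | 61.47 | 68.82 | 47.56 | 55.09 | 46.46 | 28.24 | 35.2 |
| 2023.05 | 48.69 | 60.28 | 69.31 | 47.06 | 53.43 | 45.96 | 29.18 | 35.6 |
| 2026.02 | 47.93 | - | - | - | - | - | - | - |
| 2026.02 | 47.88 | - | - | - | - | - | - | - |
| 2026.02 | 47.45 | - | - | - | - | - | - | - |
| 2026.02 | 47.19 | - | - | - | - | - | - | - |
| 2026.02 | 46.77 | - | - | - | - | - | - | - |
| 2026.02 | 46.74 | - | - | - | - | - | - | - |
| 2026.02 | 46.72 | - | - | - | - | - | - | - |
| 2024.03 | 46.61 | 62.11 | 64.96 | 40.52 | 51.54 | 46.38 | 28.33 | 32.4 |
| 2026.02 | 46.26 | - | - | - | - | - | - | - |
| 2026.02 | 46.14 | - | - | - | - | - | - | - |
| 2026.03 | 46.07 | 68.65 | 61.7 | 40.6 | 57.14 | 40.74 | 24.83 | 28.8 |
| 2026.03 | 46 | 68.78 | 61.15 | 41.12 | 55.88 | 40.11 | 25.34 | 29.6 |
| 2023.05 | 45.43 | 50.9 | 57.38 | 38.12 | 55.98 | 42.68 | 34.2 | 38.78 |
| 2024.03 | 45.43 | 50.9 | 57.38 | 38.12 | 55.98 | 42.68 | 34.2 | 38.78 |
| 2026.02 | 44.87 | - | - | - | - | - | - | - |
| 2026.02 | 42.46 | - | - | - | - | - | - | - |
| 2026.02 | 42.4 | - | - | - | - | - | - | - |
| 2023.05 | 40.42 | 47.4 | 54.36 | 33.49 | 53.1 | 37.88 | 26.6 | 30.12 |
| 2024.03 | 40.42 | 47.4 | 54.36 | 33.49 | 53.1 | 37.88 | 26.6 | 30.12 |
| 2026.03 | 38.27 | 56.67 | 53.48 | 28.36 | 50.59 | 30.01 | 21.76 | 27 |
| 2026.03 | 37.3 | 53.76 | 53.48 | 28.37 | 49.25 | 30.3 | 20.31 | 25.6 |
| 2026.03 | 35.34 | 37.83 | 52.94 | 26.99 | 51.3 | 28.07 | 23.04 | 27.2 |
| 2026.03 | 34.58 | 38.07 | 51.52 | 27.07 | 48.62 | 28.49 | 23.29 | 25 |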
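
The Average Accuracy figure appears to be the unweighted mean of the seven per-task accuracies (BoolQ, PIQA, HellaSwag, WinoGrande, ARC-e, ARC-c, OBQA), rounded to two decimals. The following is a minimal Python sketch of that arithmetic, assuming the per-task scores follow the task order given in the page title. The names `TASKS` and `mean_accuracy` are hypothetical and not part of the leaderboard. The inputs are the per-task scores of the top entry (2026.03) and of the 2024.03 entry with average 68.59.

```python
from statistics import fmean

# Assumed task order, taken from the page title.
TASKS = ("BoolQ", "PIQA", "HellaSwag", "WinoGrande", "ARC-e", "ARC-c", "OBQA")


def mean_accuracy(scores):
    """Return the unweighted mean of the per-task accuracies, rounded to 2 decimals."""
    if len(scores) != len(TASKS):
        raise ValueError(f"expected {len(TASKS)} scores, got {len(scores)}")
    return round(fmean(scores), 2)


# Per-task scores of the top entry (2026.03).
top_entry = [81.38, 80.79, 79.16, 72.61, 77.74, 53.24, 45.0]
# Per-task scores of the 2024.03 entry listed with average 68.59.
entry_2024 = [76.5, 79.8, 76.1, 70.1, 72.8, 47.6, 57.2]

print(mean_accuracy(top_entry))   # 69.99
print(mean_accuracy(entry_2024))  # 68.59
```

A few entries differ from this mean by 0.01 (for example, the entries listed at 60.05 compute to 60.06), presumably because the listed averages were rounded or truncated at the source. Entries that show only an average cannot be checked this way.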