Zero-shot Reasoning on (BoolQ, PIQA, HellaSwag, WinoGrande, ARC-e, ARC-c, OBQA)

BoolQ Accuracy (Zero-shot): 82.813

Vicuna-13B-v1.3 (Original)

[Chart: BoolQ Accuracy (Zero-shot) over time, Feb 26, 2025 to Mar 12, 2026]
Updated 29d ago

Evaluation Results

Date     BoolQ   PIQA    HellaSwag  WinoGrande  ARC-e   ARC-c   OBQA   Avg
2025.02  82.813  78.346  77.017     71.113      75.547  47.611  45.4   68.264
2025.02  81.1    76.77   73.72      68.35       71.51   42.49   41     64.991
2025.02  81.04   76.442  74.846     70.324      71.801  44.625  41.8   65.84
2025.02  80.55   79.053  79.367     72.139      79.377  49.147  45.2   69.262
2025.02  77.09   77.09   74.42      67.96       67.8    42.32   42.8   64.211
2026.03  75.75   77.8    71.02      67.64       69.02   40.87   42.4   63.5
2025.02  75.535  77.476  73.571     68.272      72.18   44.198  43.2   64.919
2025.02  75.17   73.99   65.54      67.56       61.32   36.77   37.4   59.679
2025.02  74.526  78.346  72.426     69.219      69.739  40.529  43.2   63.998
2025.02  72.78   74.65   69.07      68.35       70.83   40.61   40     62.327
2025.02  71.315  79.162  74.836     67.324      73.485  43.771  41.6   64.499
2025.02  71.25   69.59   61.68      64.4        59.05   34.13   36     56.586
2025.02  70.581  76.659  67.317     65.272      63.258  35.154  40.8   59.863
2025.02  69.63   77.48   74.75      67.01       73.48   44.11   44     64.351
2026.03  69.51   75.68   67.51      64.09       66.37   38.48   40.2   60.26
2025.02  69.45   74.918  67.447     63.378      66.414  38.311  37.4   59.617
2026.03  68.99   79.05   76.6       69.77       73.32   45.56   42     65.04
2025.02  68.532  74.102  66.421     64.009      67.593  41.126  38.6   60.055
2025.02  68.318  76.279  75.204     71.113      74.79   46.672  42.4   64.968
2026.03  67.8    75.63   66.7       63.46       66.96   38.05   41     59.94
2026.03  67.31   74.16   65.21      61.8        66.96   38.31   39.8   59.08
2026.03  66.97   73.29   62.12      61.97       61.49   36.86   39     57.39
2026.03  66.88   72.47   63.17      64.48       63.55   34.81   39.8   57.88
2026.03  66.18   69.86   57.56      62.35       59.97   33.11   38.4   55.35
2025.02  66.086  76.061  67.785     59.195      67.382  39.334  41.8   59.663
2026.03  64.34   73.67   62.55      60.46       64.81   37.63   40.4   57.69
2025.02  64.006  77.421  76.369     71.665      76.557  48.549  43.8   65.481
2025.02  63.333  72.307  67.128     63.694      66.667  39.761  37     58.556
2025.02  63.211  76.061  70.116     65.43       70.749  39.932  39.2   60.671
2026.03  62.66   65.02   48.95      57.85       47.43   31.48   37     50.06
2026.03  62.39   34.12   53.35      35.82       27.82   34      43.96  -
2025.02  62.385  70.62   58.415     55.643      62.5    36.689  33.8   54.293
2025.02  62.385  70.62   58.415     55.643      62.5    36.689  33.8   54.293
2026.03  62.26   67.08   52.83      54.7        50.04   31.91   37.6   50.92
2026.03  62.08   52.98   36.9       52.25       38.05   27.39   34.8   43.49
2026.03  61.99   67.95   54.17      54.62       53.03   33.45   39.6   52.12
2026.03  61.68   53.48   26.48      49.64       27.74   24.91   30.6   39.22
2026.03  61.65   61.43   42.01      52.01       38.22   28.24   35.6   45.59
2025.02  61.65   71.22   63.96      61.4        57.49   35.32   37     55.434
2026.03  61.16   53.05   26.65      49.49       29      26.19   29.6   39.31
2026.03  60.83   56.64   27.38      50.12       29.12   23.04   31.2   39.76
2025.02  59.939  73.286  68.751     65.983      68.981  41.724  39     59.666
2025.02  59.02   55.11   33.58      52.8        29.84   24.91   28.6   40.551
2026.03  58.65   62.51   41.58      52.57       43.27   29.44   35.2   46.17
2026.03  56.64   56.58   30.45      50.28       37.92   25.85   33.2   41.56
2026.03  53.79   63.98   43.67      54.85       45.45   29.44   36     46
2026.03  52.08   65.34   44.05      53.59       52.69   29.01   35.2   47.42
2026.03  51.28   56.04   30.77      51.78       32.03   28.32   32.2   40.35
2026.03  51.16   52.67   26.09      47.51       26.6    27.39   29.2   37.23
2026.03  50.98   75.84   64.14      59.83       66.12   38.65   40     56.51
2026.03  44.01   71.33   57.85      56.51       62.21   37.12   38.8   52.55
2026.03  43.94   73.45   56.13      57.3        58.89   34.3    38.2   51.74
2026.03  38.5    63.22   41.22      53.28       46.97   29.35   34.6   43.88
2025.02  37.829  51.034  25.503     50.908      26.221  27.218  27.4   35.159
2026.03  37.74   53.23   25.95      48.78       27.48   28.92   29.6   35.96
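
In the first row of results (the 82.813 BoolQ entry), the final value, 68.264, appears to be the unweighted mean of the seven preceding accuracies. The page itself does not label that value, so this is an inference from the arithmetic. A minimal Python sketch of the check, where `scores`, `reported_avg`, and `mean_accuracy` are illustrative names rather than anything the page defines:

```python
# Accuracies copied from the first results row, in the benchmark order given
# in the page title: BoolQ, PIQA, HellaSwag, WinoGrande, ARC-e, ARC-c, OBQA.
scores = [82.813, 78.346, 77.017, 71.113, 75.547, 47.611, 45.4]

# Final value of the same row (assumed here to be the average column).
reported_avg = 68.264


def mean_accuracy(values):
    """Return the unweighted arithmetic mean of a list of accuracies."""
    return sum(values) / len(values)


computed = mean_accuracy(scores)

# Allow half a unit in the last reported digit to absorb rounding.
matches = abs(computed - reported_avg) < 0.0005

print(round(computed, 3))  # 68.264
print(matches)             # True
```

The same check can be repeated on any other row to confirm how its fused values should be split. It does not apply to rows that contain a missing value, shown as `-`.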