
Zero-shot Language Modeling on LM Evaluation Harness

Top result: 80.66 WG, LO-BCQ (Feb 7, 2025)

Evaluation Results

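| Date | WG | | | | | Avg | Δ |
|---|---|---|---|---|---|---|---|
| 2025.02 | 80.66 | 49.28 | 84.93 | 81.34 | 65.18 | 72.28 | 0.12 |
| 2025.02 | 79.95 | 48.8 | 85.23 | 81.56 | 65.27 | 72.16 | - |
| 2025.02 | 78.77 | 49 | 82.82 | 81.45 | 64.21 | 71.25 | 0.91 |
| 2025.02 | 78.37 | 49.28 | 84.03 | 81.45 | 64.76 | 71.58 | 0.58 |
| 2025.02 | 77.27 | 47.85 | 82.29 | 79.82 | 61.4 | 69.73 | 2.43 |
| 2025.02 | 76.4 | 48.04 | 82.41 | 80.58 | 63.24 | 70.13 | 2.03 |
| 2025.02 | 76.32 | 47.75 | 83.06 | 80.58 | 63.24 | 70.19 | 1.97 |
| 2025.02 | 76.24 | - | - | 82.43 | - | - | - |
| 2025.02 | 70.64 | 40.67 | 76.54 | 79.16 | 57.11 | 64.82 | - |
| 2025.02 | 70.17 | 39.43 | 77.09 | 78.62 | 56.6 | 64.38 | 0.44 |
| 2025.02 | 69.77 | 42.58 | 77.43 | 77.09 | 56.51 | 64.68 | 0.97 |
| 2025.02 | 69.61 | 40.57 | 65.81 | 77.2 | 54.82 | 61.6 | 3.22 |
| 2025.02 | 69.38 | 44.4 | 79.29 | 78.07 | 57.1 | 65.65 | - |
| 2025.02 | 69.3 | 39.62 | 75.35 | 78.89 | 56.64 | 63.96 | 0.86 |
| 2025.02 | 69.14 | 40.48 | 75.41 | 78.24 | 56.06 | 63.87 | 0.95 |
| 2025.02 | 68.9 | 42.49 | 77.58 | 77.09 | 55.93 | 64.4 | 1.25 |
| 2025.02 | 68.9 | 43.73 | 77.86 | 77.86 | 56.52 | 64.97 | 0.68 |
| 2025.02 | 67.96 | 39.04 | 72.26 | 77.86 | 54.77 | 62.38 | 2.44 |
| 2025.02 | 67.88 | 41.34 | 68.32 | 78.78 | 54.16 | 62.1 | - |
| 2025.02 | 67.72 | 39.43 | 69.45 | 77.75 | 53.71 | 61.61 | 0.49 |
| 2025.02 | 67.48 | 41.34 | 74 | 77.53 | 54.22 | 62.91 | 2.74 |
| 2025.02 | 67.01 | 39.71 | 65.35 | 76.12 | 50.22 | 59.68 | 2.42 |
| 2025.02 | 67 | 39.62 | 69.3 | 77.37 | 53.51 | 61.36 | 0.74 |
| 2025.02 | 66.93 | 40.86 | 63.91 | 76.28 | 51.38 | 59.87 | 2.23 |
| 2025.02 | 66.85 | 40.48 | 69.2 | 77.31 | 53.06 | 61.38 | 0.72 |
| 2025.02 | 66.22 | 41.43 | 73.98 | 77.04 | 55.19 | 62.77 | 2.88 |
| 2025.02 | 65.11 | 38.28 | 66.27 | 75.63 | 50.77 | 59.21 | 2.89 |
| 2025.02 | 64.17 | 39.14 | 69.61 | 75.68 | 47.6 | 59.24 | 5.58 |
| 2025.02 | 63.77 | - | - | 76.77 | - | - | - |
| 2025.02 | 55.49 | 31.39 | 65.75 | 67.3 | 43.51 | 52.69 | 12.96 |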
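
In each complete row, the sixth value appears to be the unweighted mean of the first five scores, and the final value appears to be the gap between that mean and the mean of a reference row whose final value is "-". The page does not state this; it is inferred from the numbers. A minimal Python sketch that checks it on the first two result rows (all variable and function names are illustrative):

```python
# First five scores of the top row (80.66 ... 65.18) and of the next row,
# whose last value is "-" and which is assumed here to be the reference row.
top_row = [80.66, 49.28, 84.93, 81.34, 65.18]
reference_row = [79.95, 48.8, 85.23, 81.56, 65.27]


def mean_score(scores):
    """Unweighted mean of the per-task scores, rounded to two decimals."""
    return round(sum(scores) / len(scores), 2)


top_avg = mean_score(top_row)        # compare with the listed 72.28
ref_avg = mean_score(reference_row)  # compare with the listed 72.16

# The listed gap carries no sign, so the absolute difference is used here
# (an assumption about how the page reports it).
gap = round(abs(ref_avg - top_avg), 2)

print(top_avg, ref_avg, gap)
```

Running the same check on the other complete rows is a quick way to confirm that fused values were split at the right decimal boundaries.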