
General Language Understanding on tinyBenchmark

77.51 Accuracy (ARC)

No Steering

[Chart: results over time, Oct 5, 2025 to Feb 13, 2026]
Updated 19d ago

Evaluation Results

Date     ARC    (2)    (3)    (4)    (5)    Mean   (7)
2026.02  77.51  87.75  77.12  61.49  76.3   76.03  -
2026.02  77.51  87.75  77.12  61.49  76.3   76.03  -
2026.02  75.86  86.89  77.31  61.85  77.05  75.79  -
2026.02  75.13  86.89  77.31  61.78  74.03  75.03  -
2026.02  74.08  86.68  78.11  61.8   77.36  75.61  -
2026.02  73.96  82.71  74.6   64.5   73.77  73.91  -
2026.02  73.96  83.14  74.87  64.73  75.93  74.53  -
2026.02  73.96  82.71  74.6   64.5   73.77  73.91  -
2025.10  73.96  82.7   74.6   64.5   73.77  -      90.12
2025.10  73.45  83.24  76.11  61.36  75.47  -      86.91
2025.10  73.15  82.24  74.55  64.03  73.31  -      89.27
2025.10  72.93  82.95  75.67  60.74  75.06  -      86.6
2025.10  72.84  82.83  75.51  60.85  75.11  -      86.38
2025.10  72.74  81.94  74.3   63.01  72.93  -      87.01
2026.02  72.69  82.89  74.63  63.87  72.65  73.35  -
2026.02  72.52  82.71  74.63  63.44  72.23  73.11  -
2025.10  72.4   81.52  73.9   63.2   72.6   -      86.2
2025.10  72.13  82.6   74.52  63.6   73.04  -      88.96
2025.10  72.13  81.79  74.59  59.49  74.21  -      84.7
2025.10  71.03  80.82  73.9   58.79  73.44  -      84
2026.02  69.31  82.22  76.6   55.07  72.35  71.11  -
2026.02  69.31  81.35  74.9   51.82  71.02  69.68  -
2026.02  69.31  81.34  75.88  52.56  72.54  70.33  -
2026.02  69.31  82.22  76.6   55.07  72.35  71.11  -
2025.10  69.31  82.31  76.6   55.07  72.34  -      83.19
2026.02  68.55  80.49  75.88  52.55  71.86  69.87  -
2026.02  68.36  78.88  72.57  56.41  75.2   70.28  -
2026.02  68.36  78.88  72.57  56.41  75.2   70.28  -
2025.10  68.32  81.7   75.33  53.13  71.82  -      81.47
2026.02  68.28  76.77  72.45  57.32  75.28  70.02  -
2025.10  68.21  81.45  72.29  51.86  71.51  -      80.14
2025.10  67.91  81.59  74.89  52.49  71.42  -      79.24
2026.02  67.71  79.58  71.22  57.35  73.59  69.89  -
2025.10  67.5   81.2   71.1   51.1   71.15  -      79.2
2026.02  67.15  76.89  72.76  57.9   76.67  70.27  -
2026.02  65.33  82.51  62.02  54.39  65.57  65.96  -
2026.02  65.33  82.51  62.02  54.39  65.57  65.96  -
2025.10  65.33  82.51  62.02  54.39  65.56  -      63.21
2026.02  65.12  80.46  64.05  54.53  65.52  65.94  -
2026.02  64.86  80.18  63.55  54.66  67.34  66.12  -
2025.10  64.26  82.01  61.37  54.33  65.21  -      61.85
2026.02  63.95  80.46  62.13  54.58  65.93  65.41  -
2026.02  62.42  70.78  66.48  57.99  66.07  64.75  -
2025.10  62.3   81.87  61.54  54.24  64.93  -      61.99
2026.02  62.29  73.18  68.03  56.43  70.65  66.12  -
2026.02  62.29  73.18  68.03  56.43  70.65  66.12  -
2025.10  62.29  73.18  68.03  56.43  70.65  -      17.64
2025.10  62.01  81.73  60.96  54.17  64.81  -      60.57
2025.10  61.95  72.4   66.11  54.95  69.85  -      14.8
2025.10  61.4   81.35  60.2   53.7   64.45  -      60
2025.10  61.28  72.71  66.62  54.75  70.12  -      15.57
2025.10  61.2   72.59  67.29  54.1   69.72  -      16.01
2025.10  61.05  72.03  65.7   54.3   69.4   -      14.6
2026.02  60.71  75.13  76.97  54.81  73.16  68.16  -
2026.02  60.59  72.67  66.19  58.37  62.93  64.15  -
2026.02  60.43  76.91  77.4   54.93  72.66  68.47  -
2026.02  59.87  74.99  66.01  58.63  59.8   63.86  -
2026.02  59.72  76.86  77.4   55.04  75.06  68.82  -
2026.02  59.18  74.17  63.73  56.22  70.69  64.8   -
2026.02  58.28  73.43  64.54  55.44  71.99  64.74  -
2026.02  57.71  73.43  65.94  55.84  70.79  64.74  -
2026.02  57.7   72.44  62.45  51.95  63.75  61.66  -
2026.02  57.02  72.44  64.19  51.96  65.47  62.22  -
2026.02  56.96  72.77  64.75  51.32  65.71  62.3   -
2026.02  56.44  76.9   60.68  50.91  57.56  60.5   -
2026.02  55.86  75.92  63.48  50.19  58.64  60.82  -
2026.02  55.86  75.92  63.48  50.19  58.64  60.82  -
2026.02  55.51  77.19  61.21  51.09  59.33  60.87  -
2026.02  55.15  78.94  63.15  50.58  58.33  61.23  -
2026.02  53.52  79.44  57.23  47.3   69.55  61.41  -
2026.02  52.68  79.44  56.78  47.44  68.33  60.93  -
2026.02  51.71  79.9   55.61  47.45  68.79  60.69  -
2026.02  50.31  62.94  41.9   42.7   59.08  51.39  -
2026.02  50.31  62.88  46.5   42.79  59.21  52.34  -
2026.02  50.21  61.7   58.53  40.47  54.14  53.01  -
2026.02  50.03  61.7   43.34  42.81  60.21  51.62  -
2026.02  49.68  63.54  57.33  40.72  56.75  53.6   -
2026.02  48.64  63.54  57.02  40.47  55.7   53.07  -
2026.02  46.18  62.67  61.97  45.55  64.12  56.1   -
2026.02  45.99  62.67  61.36  45.96  64.3   56.06  -
2026.02  45.1   63.81  61.82  45.73  64.5   56.19  -