
General Language Understanding and Reasoning on Open LLM Leaderboard Lighteval (test)

91.07 Mean Accuracy

GPT-5

[Chart: Mean Accuracy by date; x-axis label Jan 29, 2026]
Updated 4d ago

Evaluation Results

Date     Mean   Component scores (Method, Links, and score labels not preserved)
2026.01  91.07  91.4   95.31  91.36  94.85  87.1   87.85  89.6
2026.01  74.33  77.07  71.76  82.71  79.87  52.47  73.01  83.44
2026.01  73.34  77.38  69.62  86.05  79     47.75  73.64  79.97
2026.01  73.17  82.11  69.28  87.49  70.93  48.17  65.98  88.26
2026.01  70.86  78.73  68.09  81.73  79.62  43.84  73.16  -
2026.01  69.58  78.18  65.19  82.34  77.76  42.44  71.59  -
2026.01  69.31  78.63  66.72  81.12  79.26  38.09  72.06  -
2026.01  69.28  64.11  63.91  77.79  81.35  53.15  68.51  76.17
2026.01  69.23  77.8   65.53  82.03  77.96  42.19  69.85  -
2026.01  68.2   67.29  57.51  77.41  78.91  45.93  72.61  77.75
2026.01  66.71  66.17  53.07  75.28  79.07  46.52  73.24  73.58
2026.01  65.92  73.59  62.54  75.66  56.7   45.23  62.51  85.21
2026.01  64.89  69.53  57.17  77.94  74.8   33.17  69.06  72.58
2026.01  64.82  65.09  51.19  71.8   79.49  44.62  72.69  68.85
2026.01  61.15  66.31  58.19  49.05  82.08  35.98  75.3   -
2026.01  60.24  63.62  58.45  46.17  81.32  38.71  73.16  -
2026.01  56.98  56.49  58.96  30.86  80.94  48.53  72.06  50.99