
Language Modeling and Reasoning Evaluation on Open LLM Leaderboard

ARC: 82.8

DPO

[Chart: leaderboard score over time, Feb 23, 2024 to Jun 9, 2024]
Updated 1mo ago

Evaluation Results

Date       ARC                                       Avg
2024.06   82.8   50.6   74.3     52   65.8   63.1      -
2024.06   82.7   48.7   74.9   53.9   65.4     63      -
2024.06   82.5   50.8   74.4   53.1   66.1   62.9      -
2024.06   82.5   51.1   74.4   51.8   66.2   62.8      -
2024.06   82.4   46.5     74   52.3   65.1     63      -
2024.06   81.6     42   73.9   45.2   63.2   62.6      -
2024.02  73.34  47.07  74.03  16.22  79.95  46.55  56.19
2024.02  72.07  44.84  75.85  15.01   78.6  44.42  55.13
2024.02  72.04  47.59  72.85   11.9  77.86  42.27  54.09
2024.04   59.4   37.4   76.6   22.5   82.1   55.7   55.6
2024.04   59.4   37.4   76.6   22.5   82.1   55.7   55.6
2024.04   59.4   37.4   76.6   22.5   82.1   55.7   55.6
2024.04   58.6   38.8     77     21     82   54.6   55.3
2024.04   57.9   38.9   75.7   19.6   81.9   54.2   54.7
2024.04   57.4   36.1   76.8   20.4   81.5   54.6   54.5
2024.04   57.3     38   76.2   18.4   81.1   53.5   54.1
2024.04   56.7   38.9   75.4   17.8     81   52.5   53.7
2024.04   56.5     36   74.4   16.4   79.2   52.4   52.5
2024.04   55.5     39   75.1   16.8   80.7   52.9   53.3
2024.04   55.3   38.8   75.5     16   80.7   51.9     53
2024.04   54.1   34.9   75.4   15.6   79.1   50.3   51.6
2024.04   53.2   38.8     74   14.5   78.6   46.7     51
2024.04   53.2   38.8     74   14.5   78.6   46.7     51
2024.04   53.2   38.8     74   14.5   78.6   46.7     51
2024.04     52     39   72.2   10.6   77.5   43.7   49.2
2024.04     52   39.1   72.9   11.1   77.6   43.8   49.4
2024.04   51.5   39.5   73.5   12.9   77.8   44.4   49.9
2024.04   50.6   38.6   71.8   10.4     77   43.9   48.7
2024.04   49.7   38.1   72.4    8.8   76.3   41.6   47.8
2024.04   49.2   37.3   71.6    7.3   75.5   40.6   46.9
2024.04   48.6   38.8   71.7    8.1   73.9   39.4   46.8
2024.04   48.6   39.1   70.8    5.4   72.9   38.2   45.8
2024.04   48.2   37.7   70.8    6.7   75.4     39   46.3