LLM Evaluation on HuggingFace Open LLM Leaderboard Old (test)

92.08 GSM8K Score

Qwen+SFT+UNA

[Chart: GSM8K Score; y-axis ticks 35.8576, 50.4538, 65.05, 79.6462; x-axis date Oct 28, 2024]
Updated 23d ago

Evaluation Results

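| Method | Date | GSM8K | | | | | | | Links |
| | 2024.10 | 92.08 | 62.74 | 82.4 | 71.5 | 85.37 | 83.36 | 79.58 | |
| | 2024.10 | 90.68 | 66.7 | 83.58 | 71.84 | 85.58 | 83.35 | 80.29 | |
| | 2024.10 | 88.29 | 58.96 | 82.64 | 70.9 | 85 | 83.34 | 78.19 | |
| | 2024.10 | 87.38 | 58.81 | 82 | 70.9 | 84.99 | 83.39 | 77.91 | |
| | 2024.10 | 85.79 | 57.75 | 82 | 70.31 | 85.22 | 83.19 | 77.38 | |
| | 2024.10 | 85.67 | 58.54 | 82 | 70.14 | 85.24 | 83.31 | 77.48 | |
| | 2024.10 | 85.56 | 59.77 | 82.16 | 70.14 | 85.32 | 83.32 | 77.71 | |
| | 2024.10 | 45.57 | 51.18 | - | 63.82 | 83.54 | 62.44 | 64.25 | |
| | 2024.10 | 42.57 | 49.67 | - | 61.86 | 83.83 | 62.06 | 63.23 | |
| | 2024.10 | 42.19 | 47.83 | - | 62.16 | 84.03 | 62.38 | 62.84 | |
| | 2024.10 | 41.59 | 54.05 | - | 63.82 | 84.44 | 62.33 | 64.34 | |
| | 2024.10 | 39.99 | 49.54 | - | 62.46 | 84.08 | 62.3 | 63.02 | |
| | 2024.10 | 39.65 | 51.06 | - | 63.99 | 83.78 | 61.99 | 63.17 | |
| | 2024.10 | 38.02 | 42.58 | - | 61.43 | 83.44 | 62.51 | 60.93 | |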