
General Language Modeling on General Benchmarks Llama 3.1 8B

Generation Quality Score: 66.5

[Chart: Generation Quality Score with baseline; y-axis ticks 62.964, 63.882, 64.8, 65.718; x-axis date May 9, 2025]
Updated 4d ago

Evaluation Results

Method / Links (entries and metric column names not preserved)

Date      Values
2025.05   66.5   43.2   36.4   62.3   70
2025.05   66.2   43     36.7   62.8   69
2025.05   66.2   43.2   35.5   62.4   70
2025.05   66.1   43.4   35.6   62.9   70
2025.05   66.1   43.4   35.3   62.7   69
2025.05   66     41.9   36.1   59.8   69
2025.05   65.9   43.2   35.3   63.2   67
2025.05   65.7   41.9   35.1   63.3   67
2025.05   65.2   28.1   29.1   57     58
2025.05   65     41     35.4   48.5   66
2025.05   63.1   20.5   35.6   13.9   63