
General Benchmarks

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| General Multimodal Understanding | General Benchmarks | Average Score: 74 | 12 |
| General Language Modeling | General Benchmarks Llama 3.1 8B | Generation Quality Score: 66.5 | 11 |
| Natural Language Understanding and Reasoning | General Benchmarks Italian | ARC-C-it: 37.47 | 6 |
| General Language Understanding | General Benchmarks (MMLU, AlpacaEval, Arena-Hard) | MMLU Accuracy: 73.41 | 4 |