
Mixtral

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Training Efficiency | Mixtral-8x22b-G8T8 Fine-grained | MFU 28.8 | 5 |
| Training Efficiency | Mixtral-8x22B Coarse-grained | MFU 49.3 | 5 |
| Training Stability Analysis | Mixtral 8x1B pre-training | Num Spikes 0 | 2 |