
FineWeb

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Language Modeling | FineWeb (val) | Validation Loss 2.03 | 156 |
| Language Modeling | FineWeb 100M token (val) | Perplexity 12.11 | 9 |
| LLM Pretraining | FineWeb-Edu (val) | BPB 0.861 | 8 |
| LLM Pretraining | FineWeb-Edu (train) | Training Loss 2.964 | 8 |
| Data Filtering | FineWeb-edu CC-MAIN-2024-10 | Recall@30 81.9 | 7 |
| Speculative Decoding | Fineweb-edu distillation 8B to 300M | Spec. Accept % 62 | 7 |
| Language Modeling | FineWeb-Edu | PPL 12.318 | 6 |
| Speculative Decoding | Fineweb-edu 1.0 (test) | Speculative Accept Rate 0.735 | 6 |
| Model Calibration | Fineweb-edu 1.0 (test) | ECE 0.002 | 6 |
| Language Modeling | Fineweb-edu 1.0 (test) | LM Loss 2.32 | 6 |
| Language Identification | FineWeb2 | Macro F1 94.52 | 5 |
| Language Modeling | Fineweb-Edu | Accuracy 0.3919 | 3 |
| Pre-training | FineWeb 124M Transformer (val) | Training Time to Loss 3.28 (min) 2.1 | 3 |
| Language model training | FineWeb (val) | Training Time (s) 95.2 | 2 |