FineWeb-Edu

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Language Modeling | FineWeb-Edu (test) | Perplexity (Test): 20.7 | 49 |
| Language Modeling | FineWeb-Edu (val) | Final Validation Loss: 3.003 | 18 |
| Language Modeling | FineWeb-Edu 500M-token (val) | Valid Loss: 2.221 | 18 |
| Language Modeling | FineWeb-Edu 100B (val) | CE Loss: 2.62 | 13 |
| Soft Search | FineWeb-Edu English, 1.4T tokens (test) | Similarity Score: 100 | 12 |
| Language Modeling | FineWeb-EDU (train) | Loss: 2.993 | 10 |
| Language Modeling | FineWeb-Edu 100B (eval) | Perplexity: 13.75 | 9 |
| In-context Learning | Fineweb-Edu 16.8B tokens | ARC-c Accuracy: 36.86 | 8 |
| Language Modeling | Fineweb-edu distillation 8B to 300M | LM Loss: 2.74 | 7 |
| Language Modeling | FineWeb-edu 12 text passages (held-out) | Average Loss: 1.683 | 3 |
| Language Modeling | FineWeb-Edu 1.4B tokens (val) | Loss: 3.271 | 3 |