| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Language Modeling | FineWeb-Edu (test) | Perplexity (Test): 20.7 | 49 |
| Language Modeling | FineWeb-Edu 500M-token (val) | Valid Loss: 2.221 | 18 |
| Soft Search | FineWeb-Edu English, 1.4T tokens (test) | Similarity Score: 100 | 12 |
| Language Modeling | FineWeb-Edu (val) | Final Validation Loss: 4.2838 | 8 |
| Language Modeling | Fineweb-edu distillation 8B to 300M | LM Loss: 2.74 | 7 |
| Language Modeling | FineWeb-Edu 1.4B tokens (val) | Loss: 3.271 | 3 |