Language Model Pre-training on C4 Llama-1B scratch (val)
Validation Loss: 2.638 (SF-AdamW, Dec 18, 2025)
Evaluation Results
Method         Method details               Date      Validation Loss
SF-AdamW       Number of inner steps...     2025.12   2.638
SF-AdamW       Number of inner steps...     2025.12   2.638
SF-AdamW       Number of inner steps...     2025.12   2.638
SF-AdamW       Number of inner steps...     2025.12   2.638
GPA-AdamW      Number of inner steps...     2025.12   2.645
GPA-AdamW      Number of inner steps...     2025.12   2.6553
DiLoCo-AdamW   Number of inner steps...     2025.12   2.6558
DiLoCo-AdamW   Number of inner steps...     2025.12   2.6572
DiLoCo-AdamW   Number of inner steps...     2025.12   2.6577
GPA-AdamW      Number of inner steps...     2025.12   2.6602
GPA-AdamW      Number of inner steps...     2025.12   2.6639
DiLoCo-AdamW   Number of inner steps...     2025.12   2.6737
AdamW          Number of inner steps...     2025.12   2.6749
AdamW          Number of inner steps...     2025.12   2.6749
AdamW          Number of inner steps...     2025.12   2.6749
AdamW          Number of inner steps...     2025.12   2.6749