Language Modeling on Pre-training corpus

Loss: 1.577

Muon

[Chart: y-axis ticks 1.52792, 1.85921, 2.1905, 2.52179; x-axis Oct 21, 2025 to Apr 10, 2026]
Updated 6d ago

Evaluation Results

Date      Loss
2026.04   1.577
2026.04   1.602
2026.04   1.606
2025.10   2.619
2025.10   2.62
2025.10   2.625
2025.10   2.782
2025.10   2.803
2025.10   2.804