Language Modeling on Pre-training corpus

Best loss: 1.577 (Muon)

Updated 3mo ago

Evaluation Results

Method	Links	Loss
Muon 2026.04		1.577
Adam+Nexus 2026.04		1.602
AdamW 2026.04		1.606
Panda-3B 2025.10		2.619
Surefire-3B 2025.10		2.62
LLaMA-3.2-3B 2025.10		2.625
Panda-1B 2025.10		2.782
LLaMA-3.2-1B 2025.10		2.803
Surefire-1B 2025.10		2.804