Language Modeling on GPT Pre-training (val)

Validation Perplexity: 19.98

MUON+

Updated 2mo ago

Evaluation Results

Method	Date	Validation Perplexity
MUON+	2026.02	19.98
NorMuon	2026.02	21.31
Turbo-Muon	2026.02	21.91
AdaMuon	2026.02	22.38
MUON+	2026.02	27.64
NorMuon	2026.02	28.44
AdaMuon	2026.02	29.27
Turbo-Muon	2026.02	29.69