Language Modeling on GPT Pre-training (val)

Validation Perplexity: 19.98

MUON+

[Chart: validation perplexity; y-axis ticks 19.5916, 22.2133, 24.835, 27.4567; x-axis label Feb 25, 2026]
Updated 19d ago

Evaluation Results

Date     Validation Perplexity
2026.02  19.98
2026.02  21.31
2026.02  21.91
2026.02  22.38
2026.02  27.64
2026.02  28.44
2026.02  29.27
2026.02  29.69