Language Modeling on Public Pretraining Dataset (train)

Loss: 1.33

Top method: Adam

Updated 6d ago

Evaluation Results

Method	Links	Loss
Adam 2026.04		1.33
Adam 2026.04		1.331
Nexus 2026.04		1.338
Nexus 2026.04		1.338