Language Modeling on Public Pretraining Dataset (train)
[Figure: Loss over time. Series shown: Adam, Loss 1.33, Apr 10, 2026.]
Evaluation Results
Method   Model scale   Date      Loss
Adam     3B            2026.04   1.33
Adam     1B            2026.04   1.331
Nexus    1B            2026.04   1.338
Nexus    3B            2026.04   1.338