
Language Modeling on Pretraining Dataset

Train Loss (PT): 2.1506

LLR

[Chart: y-axis ticks 2.081544, 2.547672, 3.0138, 3.479928; x-axis dates Dec 26, 2025 through May 21, 2026]
Updated 12d ago

Evaluation Results

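Method | Date | Train Loss (PT) | Other values | Links
 | 2026.05 | 2.1506 | -8.74 |
 | 2026.05 | 2.1543 | -8.66 |
 | 2026.05 | 2.1722 | -8.86 |
 | 2026.05 | 2.1758 | -9.02 |
 | 2025.12 | 3.133 | 3.107, 22.346 |
 | 2025.12 | 3.16 | 3.139, 23.091 |
 | 2025.12 | 3.165 | 3.142, 23.156 |
 | 2025.12 | 3.203 | 3.18, 24.04 |
 | 2025.12 | 3.268 | 3.254, 25.908 |
 | 2025.12 | 3.28 | 3.271, 26.342 |
 | 2025.12 | 3.281 | 3.272, 26.353 |
 | 2025.12 | 3.288 | 3.279, 26.545 |
 | 2025.12 | 3.709 | 3.696, 40.294 |
 | 2025.12 | 3.877 | 3.855, 47.244 |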