
Causal Language Modeling on WikiText-103 GPT-2 (124M) (train)

Train Loss: 4.4614

GEM (N = 2)

[Chart: train loss; y-axis ticks 4.455988, 4.492519, 4.52905, 4.565581; data point dated Apr 23, 2026]
Updated 1mo ago

Evaluation Results

Date      Train Loss   Method   Links
2026.04   4.4614
2026.04   4.4727
2026.04   4.4833
2026.04   4.4833
2026.04   4.5341
2026.04   4.5967