Language Modeling on NanoGPT OpenWebText

Throughput (tokens/s): 391,100

AdamW

[Chart: Throughput (tokens/s) by date; date axis label: Mar 18, 2026]
Updated 1mo ago

Evaluation Results

Date      Throughput (tokens/s)   (unlabeled)   (unlabeled)   Method   Links
2026.03   391,100                 -             1.82
2026.03   342,400                 -             3.14
2026.03   295,300                 1.38          -
2026.03   214,600                 -             -
2026.03   185,800                 1.7           -
2026.03   131,800                 -             2.81
2026.03   130,000                 -             1.62
2026.03   125,700                 -             1.43
2026.03   113,700                 1.29          -
2026.03   109,000                 -             -
2026.03   108,000                 1.35          -
2026.03   88,000                  -             -
2026.03   87,000                  1.85          -
2026.03   80,300                  -             -
2026.03   73,700                  -             1.15
2026.03   68,000                  1.06          -
2026.03   64,900                  -             1.13
2026.03   64,000                  -             -
2026.03   62,400                  1.08          -
2026.03   57,600                  -             -
2026.03   46,900                  -             -
2026.03   28,200                  -             1.89
2026.03   23,000                  1.54          -
2026.03   14,900                  -             -