Pre-training corpus

Benchmarks

Task Name              Dataset Name                 SOTA Result          Trend
Language Modeling      Pre-training corpus (train)  Perplexity 15.71     20
Language Modeling      Pre-training corpus          Loss 1.577           9
Next token prediction  Pre-training corpus (train)  Token Accuracy 66.4  9