PG-19

Benchmarks

| Task Name | Dataset Name | Metric | SOTA Result | Trend |
| --- | --- | --- | --- | --- |
| Language Modeling | PG-19 (test) | Perplexity | 9.765 | 112 |
| Needle-in-the-Haystack retrieval | PG-19 mini 10K context | Accuracy (Needle-in-the-Haystack) | 100 | 30 |
| Language Modeling | PG-19 500M parameters scale (test) | PPLX | 40.72 | 20 |
| Language modeling | PG-19 (val) | Perplexity | 18.43 | 19 |
| Online Language Modeling | PG-19 (Whole Book) | PPL @ 50K | 18.87 | 17 |
| Needle-in-the-Haystack retrieval | PG-19 mini 100K context | Accuracy | 100 | 15 |
| Needle-in-the-Haystack retrieval | PG-19 mini 32K context | Accuracy | 100 | 15 |
| Language Modeling | PG-19 subword-level | Forward BPT | 3.94 | 6 |
| Long-Context Generation | PG-19 60K context length | Throughput Speedup (micro) | 6.29 | 6 |
| Long-Context Generation | PG-19 50K context length | Throughput Speedup (micro) | 5.79 | 6 |
| Long-Context Generation | PG-19 40K context length | Throughput Speedup (micro) | 5.46 | 6 |
| Long-Context Generation | PG-19 30K context length | Throughput Speedup (micro) | 4.75 | 6 |
| Language Modeling | PG-19 (dev) | Perplexity | 52.08 | 6 |
| Compression Capacity | PG-19 (test) | Max Tokens | 1,568 | 6 |
| Long-range Next-token prediction | PG-19 long-context | Perplexity (PPL) | 101.09 | 5 |
| Language Modeling | PG-19 128K context length | Perplexity | 7.244 | 2 |
| Language Modeling | PG-19 64K context length | Perplexity | 9.043 | 2 |
| Language Modeling | PG-19 8K context length | Perplexity | 12.313 | 2 |