
Proofpile

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Language Modeling | Proofpile (test) | Performance (8K Context): 3.68 | 12 |
| Language Modeling | ProofPile (16K) | Perplexity: 3.24 | 8 |
| Language Modeling | ProofPile (4K) | Perplexity: 3.26 | 8 |
| Language Modeling | ProofPile 100K | Perplexity: 3.19 | 6 |
| Language Modeling | ProofPile 32K | Perplexity: 3.21 | 6 |