
The Pile

Benchmarks

Task Name                   | Dataset Name                                      | SOTA Result                     | Trend
Language Modeling           | The Pile (test)                                   | PPL (The Pile Test) 9.213       | 27
Language Modeling           | The Pile                                          | Perplexity 4.14                 | 25
Language Modeling           | The Pile deduplicated (val)                       | Perplexity 7.14                 | 22
Language Modeling           | The Pile (val)                                    | Perplexity (bits/byte) 0.62     | 20
Language Modeling           | The Pile non-copyrighted (test)                   | BPB 0.557                       | 20
Knowledge Unlearning        | The Pile 32 sample (val)                          | EL10 (%) 0                      | 15
Membership Inference Attack | The Pile                                          | AUROC 0.927                     | 14
Training Data Extraction    | The Pile (train)                                  | Exact Extract Rate 45           | 10
Data Extraction             | The Pile (test)                                   | Fractional Extraction Rate 63.4 | 10
Language Modeling           | The Pile non-copyrighted without Wikipedia (test) | BPB 0.559                       | 8
Membership Inference Attack | The PILE (train test)                             | Loss 66.5                       | 7
Property-based retrieval    | The Pile (test)                                   | MAP 54.2                        | 6
Unsupervised OOD detection  | The Pile (ID) Twitter (OOD) (test)                | AUROC 99.22                     | 5
Unsupervised OOD detection  | The Pile EDGAR Reports ID OOD (test)              | AUROC 68.09                     | 5
Unsupervised OOD detection  | The Pile ID 4Chan OOD (test)                      | AUROC 87.97                     | 5