C4

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Language Modeling | C4 | Perplexity: 1 | 1,688 |
| Language Modeling | C4 | Perplexity: 4.77 | 1,565 |
| Language Modeling | C4 (val) | PPL: 1.7 | 737 |
| Language Modeling | C4 (test) | Perplexity: 4.97 | 464 |
| Language Generation | C4 | Perplexity: 5.52 | 190 |
| Perplexity | C4 | Perplexity: 6.24 | 137 |
| Language Modeling | C4 | C4 Loss: 2.55 | 121 |
| Watermark Detection | C4 | TPR @ FPR=1%: 1 | 95 |
| Language Modeling | C4 | Perplexity: 7.34 | 72 |
| Language Modeling | C4 | Perplexity: 9.36 | 58 |
| Pre-training | C4 (val) | Perplexity: 17.8 | 58 |
| Language Modeling | C4 (train) | PPL: 15.28 | 50 |
| LLM Pretraining | C4 | Perplexity: 13.3 | 47 |
| Language Model Pre-training | C4 Llama 2 pre-training (val) | Perplexity: 13.19 | 47 |
| Sentence-Level Watermarking | C4 | AUROC: 100 | 40 |
| Watermarking | C4 | TPR (FPR < 10^-4): 100 | 40 |
| Language Modeling | C4 LLaMA-130M (val) | Perplexity: 18.504 | 40 |
| Language Modeling | C4 | Entropy: 1 | 39 |
| Watermark Detection | C4 | TPR @ 1% FPR: 100 | 36 |
| Language Modeling | C4 | Log-PPL: 2.834 | 35 |
| Masked Language Modeling | C4 (val) | PPLX: 3.828 | 35 |
| Feature Space Preservation | C4 | Cosine Similarity: 100 | 32 |
| Language Modeling | C4 | Word Perplexity: 18.08 | 32 |
| Next Token Prediction | C4 (held-out) | Perplexity (PPL): 21.5 | 30 |
| Clustering | C4 | Clustering Score: 63.95 | 30 |
Showing 25 of 145 rows