Unsupervised OOD detection on The Pile (ID) Twitter (OOD) (test)
Top result: 99.22 AUROC (Perplexity), Feb 5, 2026
Metrics: AUROC, FPR@95
Updated 4d ago
Evaluation Results
Method        Configuration               Date      AUROC   FPR@95
Perplexity    Base Model=Pythia-160M...   2026.02   99.22     2.85
AP-OOD        Base Model=Pythia-160M...   2026.02   99.08     1.52
Mahalanobis   Base Model=Pythia-160M...   2026.02   97.86    10.91
KNN           Base Model=Pythia-160M...   2026.02   81.5     61.12
Deep SVDD     Base Model=Pythia-160M...   2026.02   81.08    76.72
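
For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how AUROC and FPR@95 are commonly computed from per-sample OOD scores. It is not taken from this benchmark's evaluation code. It assumes that a higher score means "more likely OOD" (for example, a perplexity value), that OOD is treated as the positive class, and that FPR@95 means the false positive rate at the threshold where at least 95% of OOD samples are detected. Some papers use the opposite convention, with ID as the positive class. The function names and the toy scores are hypothetical.

```python
import math
from typing import Sequence


def auroc(id_scores: Sequence[float], ood_scores: Sequence[float]) -> float:
    """Probability that a random OOD sample scores higher than a random
    ID sample, with ties counted as 0.5 (pairwise rank formulation)."""
    wins = 0.0
    for o in ood_scores:
        for i in id_scores:
            if o > i:
                wins += 1.0
            elif o == i:
                wins += 0.5
    return wins / (len(id_scores) * len(ood_scores))


def fpr_at_tpr(id_scores: Sequence[float], ood_scores: Sequence[float],
               tpr: float = 0.95) -> float:
    """Fraction of ID samples wrongly flagged as OOD at the threshold
    that flags at least `tpr` of the OOD samples (assumed convention)."""
    ranked = sorted(ood_scores, reverse=True)
    # Smallest number of top-scoring OOD samples that reaches the target TPR.
    k = math.ceil(tpr * len(ranked))
    threshold = ranked[k - 1]
    false_positives = sum(1 for s in id_scores if s >= threshold)
    return false_positives / len(id_scores)


# Toy scores for illustration only, not data from the leaderboard.
toy_id = [0.1, 0.2, 0.3, 0.4]
toy_ood = [0.35, 0.6, 0.7, 0.8]

# The table reports both metrics on a 0-100 scale, so multiply by 100.
print(f"AUROC:  {100 * auroc(toy_id, toy_ood):.2f}")
print(f"FPR@95: {100 * fpr_at_tpr(toy_id, toy_ood):.2f}")
```

Under this convention a higher AUROC and a lower FPR@95 are better, which is why a method can rank first on one metric and second on the other, as Perplexity and AP-OOD do in the table. The pairwise AUROC loop is quadratic in the number of samples and is only meant for illustration.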