
Pythia

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Activation Reconstruction | Pythia model activations | Pearson Correlation Coefficient: 0.8074 | 18 |
| Memorization mitigation | Pythia 6.9B | Memory Usage (%): 89.31 | 9 |
| Memorization mitigation | Pythia 2.8B | Memory Usage (%): 5.94 | 9 |
| Jailbreak Attack | Pythia-12B | Attack Success Rate: 70 | 4 |