
VAJA

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Jailbreak Attack Success Rate | VAJA | Identity Attack Success Rate: 92.3 | 15 |
| Toxicity Generation | VAJA (test) | Toxicity Score (Perspective API): 17.9 | 9 |