
Jailbreak Evaluation

Benchmarks

Task Name         Dataset Name                                 SOTA Result                      Trend
Jailbreak Attack  Jailbreak Evaluation GPT-4o-mini             ASR 93.33                        13
Jailbreak Attack  Jailbreak Evaluation Average across models   ASR 88.1                         5
Jailbreak Attack  Jailbreak Evaluation DeepSeek-V3             ASR 41.7                         5
Jailbreak Attack  Jailbreak Evaluation Qwen3                   Attack Success Rate (ASR) 94.5   5
Jailbreak Attack  Jailbreak Evaluation Gemini-2-Flash          ASR 37.5                         5