SafeBench

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Jailbreak Attack | SafeBench | ASR 0 | 245 |
| Jailbreak Attack | SafeBench | HF 1.7 | 54 |
| Jailbreak Attack Evaluation | SafeBench 100 sampled harmful queries | ASR 97 | 48 |
| Jailbreak Attack | SafeBench Tiny | ASR 100 | 24 |
| Jailbreak attack | Safebench (test) | IA ASR 92 | 20 |
| Safety Evaluation | SafeBench | Overall Safety Score 99 | 19 |
| Critical Scenario Generation | SafeBench Scenario 6 (Unprotected Left-turn) | Collision Rate (CR) 0 | 16 |
| Jailbreak Attack | SafeBench | ADU Success Rate 100 | 16 |
| Safety-critical scenario generation | SafeBench | Collision Rate: Straight Obstacle 30 | 8 |
| Scene Criticality Generation | SafeBench | Collision Rate (CN) 22 | 6 |
| Multimodal Jailbreaking | SafeBench FigStep (ID) | ASR 92.3 | 6 |
| Multimodal Jailbreaking | SafeBench QR (ID) | ASR 0 | 6 |
| Multimodal Jailbreaking | SafeBench Mirror (ID) | ASR 100 | 6 |
| Safety evaluation of autonomous driving | SafeBench critical scenarios | Collision Rate (SO) 2.1 | 5 |
| Multimodal Safety Evaluation | SafeBench | FS ASR 3.26 | 4 |
| Autonomous Driving Ego Robustness Evaluation | SafeBench held-out (test) | CR (Straight Obstacle) 5 | 3 |
| Jailbreaking | SafeBench evaluated on OpenAI-o1 | FS 34.8 | 1 |