MM-SafetyBench

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Safety Evaluation | MM-SafetyBench | Average ASR: 0 | 98 |
| Jailbreak Attack Defense | MM-SafetyBench | Attack Success Rate (ASR): 0.2 | 56 |
| Video Jailbreaking | MM-SafetyBench 1.0 (test) | Attack Success Rate: 96 | 48 |
| Jailbreaking Attack | MM-SafetyBench | Attack Success Rate (ASR): 95 | 42 |
| Multimodal Jailbreak Attack | MM-SafetyBench (full) | ASR: 95.42 | 40 |
| Jailbreaking MLLMs | MM-SafetyBench | WASR: 100 | 32 |
| Jailbreak Attack | MM-SafetyBench (tiny) | ASR: 99.16 | 25 |
| Safety Evaluation | MM-SafetyBench v1.0 (test) | ASR: 0.6 | 24 |
| Multimodal Safety Evaluation | MM-SafetyBench | Safety Score: 2.73 | 22 |
| Safety Evaluation | MM-SafetyBench (test) | Helpfulness Score: 68.95 | 20 |
| Jailbreak Attack Success Evaluation | MM-SafetyBench SD+TYPO | ASR: 81.4 | 18 |
| Jailbreak Attack Success Evaluation | MM-SafetyBench TYPO | Attack Success Rate (ASR): 79.6 | 18 |
| Jailbreak Attack Success Evaluation | MM-SafetyBench SD | ASR: 80.6 | 18 |
| Direct Malicious | MM-SafetyBench OOD | ASR: 0.71 | 16 |
| Response Safety | MM-SafetyBench (avg) | MS-R: 99 | 15 |
| MLLM Jailbreaking | MM-SafetyBench Physical Harm scenario | ASR: 6 | 15 |
| Multimodal Jailbreak Defense | MM-SafetyBench (full) | ASR (Illegal Activity - S): 1.03 | 12 |
| Multimodal Safety Defense | MM-SafetyBench SD_TYPO | Average ASR: 12 | 10 |
| Multimodal Safety Defense | MM-SafetyBench SD | Average ASR: 0.09 | 10 |
| Harmful Rate Evaluation | MM-SafetyBench OCR (test) | Illegal Activity Rate: 0 | 10 |
| Safety Evaluation | MM-SafetyBench (MMSB) | Attack Success Rate (V-T AVG): 1.8 | 9 |
| Structured-based Jailbreak Attack Defense | MM-SafetyBench unseen attack types | ASR (SD): 2.35 | 9 |
| Multimodal Large Language Model Safety Evaluation | MM-SafetyBench++ | Illegal Activity Unsafe Refusal Rate: 100 | 9 |
| Jailbreak Detection | MM-SafetyBench | AUROC: 99.18 | 9 |
| Multimodal Safety Evaluation | MM-SafetyBench SD + TYPO + SD_TYPO (test) | ASR Score: 0.08 | 8 |
Showing 25 of 35 rows