VLGuard

Benchmarks

Task Name	Dataset Name	Metric	SOTA Result
Target Response Induction	VLGuard	ASR	0.1	48
Safety and Helpfulness Evaluation	VLGuard	Safety Score	95.1	32
Safety Evaluation	VLGuard	ASR	0.23	27
Multimodal Safety	VLGuard	F1 Score	95.11	15
Vision-text safety classification	VLGuard	AUPRC (Prompt)	0.8843	9
Unsafe content detection	VLGuard	F1 Score	79.3	9
Multimodal Safety Evaluation	VLGuard (test)	Accuracy	86.78	6
Multimodal Jailbreaking	VLGuard Unsafe (OOD)	ASR	66.7	6
Over-Prudence Evaluation	VLGuard	RR (Before)	4.48	6
Jailbreak Attack	VLGuard Safe	Attack Success Rate (ASR)	8.44	5
Jailbreak Attack	VLGuard Image Unsafe	ASR	52.49	5
Jailbreak Attack	VLGuard Text Unsafe	ASR	34.59	5
Jailbreak Attack	VLGuard (All)	ASR	17.88	5
Safety Robustness	VLGuard Unsafe	Attack Success Rate	13.4	4
Safety Robustness	VLGuard Safe_Unsafe	Attack Success Rate	14.9	4