
VLGuard

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Vision-text safety classification | VLGuard | AUPRC (Prompt) 0.8843 | 9 |
| Unsafe content detection | VLGuard | F1 Score 79.3 | 9 |
| Jailbreak Attack | VLGuard Safe | Attack Success Rate (ASR) 8.44 | 5 |
| Jailbreak Attack | VLGuard Image Unsafe | ASR 52.49 | 5 |
| Jailbreak Attack | VLGuard Text Unsafe | ASR 34.59 | 5 |
| Jailbreak Attack | VLGuard (All) | ASR 17.88 | 5 |