
MSSBench

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Safety | MSSBench | Safety Score: 93.1 | 25 |
| Multimodal Safety Evaluation | MSSBench | Safety Score: 2.55 | 22 |
| Safety Evaluation | MSSbench (test) | Effectiveness Score: 99.66 | 20 |
| Risk Identification | MSSBench | RIR: 96.05 | 12 |
| Safety-Awareness | MSSBench | Safety Rate: 70.5 | 12 |
| Vision-text safety classification | MSSBench Embodied | AUPRC (Prompt): 61.97 | 9 |
| Vision-text safety classification | MSSBench Chat | AUPRC (Prompt): 52.07 | 9 |
| Safety Alignment | MSSBench | Safety Score: 79.18 | 4 |