Multimodal Safety Evaluation on MM-SafeBench

1.04 Forbidden Statements ASR

GPT-4o

Updated 4mo ago

Evaluation Results

Method	Links
GPT-4o 2024.11		1.04	13.23	95.21	95.48	95.91	96.4
Claude-3.5-Sonnet 2024.11		2.44	2.78	40.02	48.62	37.12	9.28
GPT-4o-Mini 2024.11		16.13	12.06	94.78	94.78	93.16	93.16
Qwen-VL-Max 2024.11		27.84	48.49	91.76	92.34	91.42	92.23