Helpfulness Evaluation on MM-Vet2 (test)
[Chart: GPT-Eval Score over time. Top entry: No Defense, 54.4 (Aug 26, 2025).]
Evaluation Results
Method       Backbone    Date      GPT-Eval Score
No Defense   Qwen2-VL    2025.08   54.4
PRISM        Qwen2-VL    2025.08   48.9
SPA-VL       Qwen2-VL    2025.08   46.8
PRISM        LLaVA-1.5   2025.08   20.4
SPA-VL       LLaVA-1.5   2025.08   20.2
SafeRLHF-V   LLaVA-1.5   2025.08   19.3
VLGuard      Qwen2-VL    2025.08   17.7
No Defense   LLaVA-1.5   2025.08   13.1
SafeRLHF-V   Qwen2-VL    2025.08   12.9
VLGuard      LLaVA-1.5   2025.08   12.3