Human Preference Agreement on MM-RewardBench2 T2I
[Chart: Accuracy over time. Best result: 78.9, Gemini 3.1 Pro + ARR, May 8, 2026.]
Updated 22d ago
Evaluation Results
Method                    Model Category    Date       Accuracy
------------------------  ----------------  ---------  --------
Gemini 3.1 Pro + ARR      ARR (Ou...        2026.05    78.9
Gemini 3.1 Pro            VLM-as-...        2026.05    75.1
GPT-5 + ARR               ARR (Ou...        2026.05    74.7
GPT-5                     VLM-as-...        2026.05    70.5
UnifiedReward-Thinking    Trained...        2026.05    66
Qwen3vl-8B + ARR          ARR (Ou...        2026.05    62.7
HPSv3                     Trained...        2026.05    60.2
UnifiedReward             Trained...        2026.05    59.8
PickScore                 Trained...        2026.05    58.6
Qwen3-VL-8B               VLM-as-...        2026.05    57.6
HPSv2                     Trained...        2026.05    54.7
ImageReward               Trained...        2026.05    54
CLIPScore                 Trained...        2026.05    51