Human Preference Agreement on MM-RewardBench2 T2I

Accuracy: 78.9

Gemini 3.1 Pro + ARR

Updated 2mo ago

Evaluation Results

Method	Date	Accuracy
Gemini 3.1 Pro + ARR	2026.05	78.9
Gemini 3.1 Pro	2026.05	75.1
GPT-5 + ARR	2026.05	74.7
GPT-5	2026.05	70.5
UnifiedReward-Thinking	2026.05	66
Qwen3-VL-8B + ARR	2026.05	62.7
HPSv3	2026.05	60.2
UnifiedReward	2026.05	59.8
PickScore	2026.05	58.6
Qwen3-VL-8B	2026.05	57.6
HPSv2	2026.05	54.7
ImageReward	2026.05	54
CLIPScore	2026.05	51