
Human Preference Agreement on MM-RewardBench2

79.2 Accuracy

Gemini 3.1 Pro + ARR

[Chart: Accuracy, axis ticks 58.4, 63.8, 69.2, 74.6; May 8, 2026]
Updated 22d ago

Evaluation Results

Date      Accuracy
2026.05   79.2
2026.05   77.5
2026.05   77.4
2026.05   73.8
2026.05   67.2
2026.05   65.5
2026.05   59.2