
Multimodal Reward Modeling on Multimodal RewardBench

88.79 Accuracy

Gemini 3.1 Pro

[Chart: accuracy over time, Feb 2, 2026 to Apr 21, 2026]
Updated 1mo ago

Evaluation Results

Date      Accuracy   (unlabeled)
2026.04   88.79      -
2026.04   87.9       -
2026.02   85.4       -
2026.02   84.3       -
2026.04   83.02      -
2026.02   82.5       -
2026.04   82.25      -
2026.04   82.23      -
2026.02   82.2       -
2026.04   82.06      -
2026.04   81.91      -
2026.04   81.19      -
2026.04   79.5       79.3
2026.04   79.35      -
2026.04   77.7       77.7
2026.04   75.82      -
2026.04   74.25      -
2026.04   72.49      -
-         71.9       -
2026.02   71.5       -
2026.04   71.3       70.5
2026.02   70.9       -
-         70.8       -
2026.04   69.12      -
2026.04   67.9       74.4
2026.02   67.1       -
2026.02   66.6       -
2026.02   65.9       -
2026.04   65.7       65.7
2026.04   65.7       72.8
2026.04   65.2       62.1
2026.02   64.4       -
2026.02   64         -
2026.04   63.5       71.9
2026.04   62.6       70.8
2026.04   62.6       71.5
2026.04   61.7       67.1
2026.04   60.7       75.3
2026.04   60.7       66.6
2026.04   60.7       48.9
2026.04   60         61.2
2026.04   59.6       63.6
2026.04   58.4       71.9
2026.04   57.8       51.2
2026.04   56.4       70.9
2026.02   54.8       -
2026.02   53.6       -
2026.02   50.5       -
2026.04   47         53
2026.02   42         -