
Multimodal Reward Modeling on MR2Bench Video

50.7 Best-of-4 Accuracy

Molmo2-4B Multi-response RM

[Chart: Best-of-4 Accuracy (y-axis, 36.036 to 47.457) by date (x-axis, Apr 13, 2026)]
Updated 4d ago

Evaluation Results

Best-of-4 Accuracy    Date
50.7                  2026.04
50.1                  2026.04
49.9
49.7
49.1                  2026.04
48.7                  2026.04
47.9                  2026.04
47.7                  2026.04
47.7
47.5                  2026.04
46.7                  2026.04
44.9                  2026.04
44.4                  2026.04
43.2                  2026.04
42.6                  2026.04
40.4                  2026.04
40.2                  2026.04
36.6