
MR2Bench

Benchmarks

Task Name                  | Dataset Name   | SOTA Result              | Trend
Multimodal Reward Modeling | MR2Bench Video | Best-of-4 Accuracy: 50.7 | 18
Multimodal Reward Modeling | MR2Bench Image | Best-of-4 Accuracy: 87.1 | 18
Reward Modeling            | MR2Bench Video | BoN Score: 50.7          | 15