MM-RewardBench

Benchmarks

Task Name                            Dataset Name          SOTA Result     Trend
Human Preference Agreement           MM-RewardBench2 T2I   Accuracy 78.9   13
Text-to-image preference evaluation  MM-RewardBench T2I 2  Accuracy 78.9   11
Human Preference Agreement           MM-RewardBench2 Edit  Accuracy 79.2   7
Image editing preference evaluation  MM-RewardBench2 Edit  Accuracy 79.2   7