Share your thoughts, 1 month free Claude Pro on usSee more
WorkDL logo mark

VLRewardBench

Benchmarks

Task NameDataset NameSOTA ResultTrend
Reward ModelingVLRewardBench (test)
General84.7
24
Vision-Language Reward Model EvaluationVLRewardBench
Accuracy74.5
8
Showing 2 of 2 rows