Share your thoughts, 1 month free Claude Pro on usSee more
WorkDL logo mark

EditReward-Bench

Benchmarks

Task NameDataset NameSOTA ResultTrend
Multi-way preference rankingEDITREWARD-BENCH
Preference Score (K=2)66.2
23
Reward ModelingEditReward-Bench
PF85.4
17
Image editing preference evaluationEditReward-Bench
Accuracy63.27
14
Visual Consistency AssessmentEditReward-Bench 2025 (test)
Subject Addition Accuracy80.88
6
Showing 4 of 4 rows