| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Multi-way preference ranking | EDITREWARD-BENCH | Preference Score (K=2): 66.2 | 23 |
| Reward Modeling | EditReward-Bench | PF: 85.4 | 17 |
| Image editing preference evaluation | EditReward-Bench | Accuracy: 63.27 | 14 |
| Visual Consistency Assessment | EditReward-Bench 2025 (test) | Subject Addition Accuracy: 80.88 | 6 |