| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Reward Modeling | EditReward-Bench | PF 85.4 | 17 |
| Multi-way preference ranking | EDITREWARD-BENCH | Preference Score (K=2) 58.61 | 11 |
| Visual Consistency Assessment | EditReward-Bench 2025 (test) | Subject Addition Accuracy 80.88 | 6 |