| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Visual World Modelling | AURORA-BENCH Average | GPT-4o Score 7.36 | 18 |
| Instruction-guided image editing preference prediction | AURORA-Bench | Accuracy 63.62 | 12 |
| Action-centric Editing | AURORA-BENCH All (test) | Human Eval Score -0.23 | 4 |