AURORA-BENCH

Benchmarks

Task Name	Dataset Name	SOTA Result
Visual World Modelling	AURORA-BENCH Average	GPT-4o Score: 7.36	18
Instruction-guided image editing preference prediction	AURORA-Bench	Accuracy: 63.62	12
Action-centric Editing	AURORA-BENCH All (test)	Human Eval Score: -0.23	4
