AURORA-BENCH

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Visual World Modelling | AURORA-BENCH Average | GPT-4o Score: 7.36 | 18
Instruction-guided image editing preference prediction | AURORA-Bench | Accuracy: 63.62 | 12
Action-centric Editing | AURORA-BENCH All (test) | Human Eval Score: -0.23 | 4