Reasoning-informed Image Editing on RISEBench 1.0 (test)

54.1 Temporal Reasoning Score

GPT-Image-1.5

Mar 10, 2026
Updated 1mo ago

Evaluation Results

Method | Links
2026.03
54.1606221.25069.792.594.9
2026.03
41.261.14837.647.27785.594.4
2026.03
12.912.2117.110.858.967.491.2
2026.03
4.710172.48.937.266.486.9
2026.03
4.77.81.85.99.443.964.479.7
2026.03
3.52.253.53.635.652.775.9
2026.03
2.41.141.22.23450.772.3
2026.03
2.45.6141.26.136.553.573
2026.03
2.35.5131.25.82671.685.2
2026.03
1.23.342.42.833.952.772.9