Image editing preference evaluation on EditReward-Bench
[Chart: Accuracy on EditReward-Bench. Top entry: Gemini 3.1 Pro, 63.27 (May 8, 2026).]
Evaluation Results

| Method | Tags | Date | Accuracy |
|---|---|---|---|
| Gemini 3.1 Pro | ARR=true | 2026.05 | 63.27 |
| Gemini 3.1 Pro + ARR | Model Category=ARR (Ou... | 2026.05 | 63.27 |
| Gemini 3.1 Pro | ARR=false | 2026.05 | 61.23 |
| Gemini 3.1 Pro | Model Category=VLM-as-... | 2026.05 | 61.23 |
| GPT-5 | ARR=true | 2026.05 | 61.01 |
| GPT-5 + ARR | Model Category=ARR (Ou... | 2026.05 | 61.01 |
| GPT-5 | ARR=false | 2026.05 | 57.53 |
| GPT-5 | Model Category=VLM-as-... | 2026.05 | 57.53 |
| Qwen3-VL-8B | ARR=true | 2026.05 | 57.22 |
| Qwen3vl-8B + ARR | Model Category=ARR (Ou... | 2026.05 | 57.22 |
| EditReward | | 2026.05 | 56.45 |
| EditReward | Model Category=Trained... | 2026.05 | 56.45 |
| Qwen3-VL-8B | ARR=false | 2026.05 | 54.01 |
| Qwen3-VL-8B | Model Category=VLM-as-... | 2026.05 | 54.01 |
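
The accuracy figures listed above can be compared pairwise. The following is a minimal sketch, assuming that the ARR=true and ARR=false entries for the same model are directly comparable (the page itself does not say so). The dictionary layout and the `arr_gain` helper are hypothetical names introduced only for illustration.

```python
# Accuracy (%) on EditReward-Bench, copied from the results listed above.
# Assumption: ARR=true and ARR=false rows for one model share the same setup.
results = {
    "Gemini 3.1 Pro": {"base": 61.23, "arr": 63.27},
    "GPT-5": {"base": 57.53, "arr": 61.01},
    "Qwen3-VL-8B": {"base": 54.01, "arr": 57.22},
}


def arr_gain(scores):
    """Percentage-point difference between the ARR=true and ARR=false entries."""
    return round(scores["arr"] - scores["base"], 2)


# Hypothetical helper output: one gain per model, in percentage points.
gains = {model: arr_gain(scores) for model, scores in results.items()}

for model, gain in gains.items():
    print(f"{model}: +{gain:.2f} points with ARR")
```

Under that assumption, the listed differences are 2.04 points for Gemini 3.1 Pro, 3.48 for GPT-5, and 3.21 for Qwen3-VL-8B.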