Image Editing on Image Editing Dataset (Counting)
[Chart: Unchanged Regions scores; highlighted entry: MLLM-as-a-Judge, 4.4]
Feb 13, 2026
Updated 1mo ago
Evaluation Results
| Metric | MLLM-as-a-Judge (Judge=Our Judge, 2026.02) | Human (Judge=Human, 2026.02) |
|---|---|---|
| Unchanged Regions | 4.4 | 3.393 |
| Global Consistency | 4.8 | 5.243 |
| Identity Preservation | 6.4 | 4.227 |
| Scale Realism | 6.4 | 5.51 |
| Spatial Relationship | 4.8 | 4.65 |
| Texture and Detail | 5.8 | 5.157 |
| Image Quality | 6.2 | 5.247 |
| Color and Lighting | 6.2 | 5.403 |
| Seamlessness | 5.6 | 5.357 |
| Alignment | 5.2 | 3.437 |
| Completeness | 5.2 | 3.537 |
| Plausibility | 6.8 | 4.917 |
| Overall Average | 5.65 | 4.673 |
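The Overall Average values appear to be the unweighted mean of the twelve per-criterion scores. The page does not state this; it is an inference from the numbers. The short Python sketch below checks that assumption using the scores listed above. The names `scores`, `reported`, and `overall_average` are illustrative and are not part of the benchmark.

```python
from statistics import mean

# Per-criterion scores copied from the results above, in the order
# Unchanged Regions ... Plausibility (twelve criteria per method).
scores = {
    "MLLM-as-a-Judge": [4.4, 4.8, 6.4, 6.4, 4.8, 5.8,
                        6.2, 6.2, 5.6, 5.2, 5.2, 6.8],
    "Human": [3.393, 5.243, 4.227, 5.51, 4.65, 5.157,
              5.247, 5.403, 5.357, 3.437, 3.537, 4.917],
}

# Overall Average values as reported on the page.
reported = {"MLLM-as-a-Judge": 5.65, "Human": 4.673}


def overall_average(values):
    """Unweighted mean, rounded to three decimals (assumed aggregation)."""
    return round(mean(values), 3)


for method, values in scores.items():
    computed = overall_average(values)
    matches = abs(computed - reported[method]) < 1e-9
    print(f"{method}: computed={computed}, reported={reported[method]}, match={matches}")
```

Both rows reproduce the reported figure under this assumption, which suggests that no criterion is weighted more heavily than another.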