Difference Caption Generation on UltraEdit
[Chart: Main Difference Score over time. Leading entry: GPT-4o, 44 (Jun 11, 2025).]
Updated 1mo ago
Evaluation Results
| Method | Date | Main Difference Score | MP Score | MP (soft) | Hit Rate (HR) | Hit Rate (HR soft) | Average Difference | No Diffs Accuracy |
|---|---|---|---|---|---|---|---|---|
| GPT-4o | 2025.06 | 44 | 9 | 11 | 0.74 | 0.68 | 1.9 | 0 |
| GPT-4 | 2025.06 | 36 | 7 | 9 | 0.82 | 0.8 | 2.5 | 0.8 |
| GPT-4 Turbo | 2025.06 | 36 | 7 | 8 | 0.8 | 0.77 | 1.5 | 4.5 |
| InternVL3 | 2025.06 | 22 | 8 | 12 | 0.9 | 0.86 | 3.4 | 3.2 |
| LLaVA (Training Protocol=Supe...) | 2025.06 | 9 | - | - | - | - | - | - |
| Qwen2.5 VL | 2025.06 | 8 | 7 | 9 | 0.84 | 0.8 | 2.5 | 0.8 |
| LLaVA | 2025.06 | 4 | - | - | - | - | - | - |