Difference Caption Generation on UltraEdit
[Chart: Main Difference Score. Top entry: GPT-4o, 44 (Jun 11, 2025)]
Updated 4d ago
Evaluation Results
| Method | Date | Main Difference Score | MP Score | MP (soft) | Hit Rate (HR) | Hit Rate (HR soft) | Average Difference | No Diffs Accuracy |
|---|---|---|---|---|---|---|---|---|
| GPT-4o | 2025.06 | 44 | 9 | 11 | 0.74 | 0.68 | 1.9 | 0 |
| GPT-4 | 2025.06 | 36 | 7 | 9 | 0.82 | 0.8 | 2.5 | 0.8 |
| GPT-4 Turbo | 2025.06 | 36 | 7 | 8 | 0.8 | 0.77 | 1.5 | 4.5 |
| InternVL3 | 2025.06 | 22 | 8 | 12 | 0.9 | 0.86 | 3.4 | 3.2 |
| LLaVA (Training Protocol=Supe...) | 2025.06 | 9 | - | - | - | - | - | - |
| Qwen2.5 VL | 2025.06 | 8 | 7 | 9 | 0.84 | 0.8 | 2.5 | 0.8 |
| LLaVA | 2025.06 | 4 | - | - | - | - | - | - |