Differences Caption Generation on MagicBrush (test)
[Chart: Main Difference Count over time. Plotted point: GPT-4o, 50, Jun 11, 2025. Selectable metrics: Main Difference Count, MP Score, MP Soft Score, HR, HR Soft Score, Average Difference, No Diffs Rate.]
Evaluation Results
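| Method | Date | Main Difference Count | MP Score | MP Soft Score | HR | HR Soft Score | Average Difference | No Diffs Rate |
|---|---|---|---|---|---|---|---|---|
| GPT-4o | 2025.06 | 50 | 11 | 12 | 61 | 57 | 1.9 | 0 |
| Qwen2.5 VL | 2025.06 | 37 | 11 | 1 | 63 | 59 | 1.5 | 60 |
| GPT-4 | 2025.06 | 34 | 8 | 8 | 75 | 74 | 2.5 | 80 |
| GPT-4 Turbo | 2025.06 | 34 | 7 | 8 | 78 | 75 | 1.5 | 450 |
| InternVL3 | 2025.06 | 25 | 8 | 11 | 88 | 83 | 3.4 | 230 |
| LLaVA (training=Supervised) | 2025.06 | 12 | - | - | - | - | - | - |
| LLaVA | 2025.06 | 5 | - | - | - | - | - | - |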