Vision-Language Reasoning on Winoground
[Chart: Simple Acc; highlighted point: LLaVA-1.5 13B, 59.88, Jan 18, 2026. Metrics shown: Simple Acc, Paired Acc, Yes-Rate.]
Updated 4d ago
Evaluation Results
Method                    Details                    Date     Simple Acc  Paired Acc  Yes-Rate
LLaVA-1.5 13B             Intervention Strategy=...  2026.01  59.88       24.63       74.38
LLaVA-1.5 13B             Intervention Strategy=...  2026.01  58.56       21.5        79.31
Q4 system redistr (prop)  Intervention=Q4 system...  2026.01  57.88       19.5        75.38
Q4 text redistr (prop)    Intervention=Q4 text r...  2026.01  55          12          92
Image×2.0                 Intervention=Image×2.0     2026.01  53.69       8.85        94.56
Q4 system abl             Intervention=Q4 system...  2026.01  53.56       8.13        94.81
PAI                       Intervention=PAI           2026.01  53.5        7.63        95.25
No intervention baseline  Intervention=None          2026.01  53.38       7.38        95.38
AD-HH                     Intervention=AD-HH         2026.01  53.06       6.63        95.94
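
To make the gaps between entries easier to read, the figures listed above can be expressed as differences from the "No intervention baseline" row. The following is a minimal sketch, assuming that row is the intended reference point (the page does not say so explicitly). The names `ROWS` and `deltas_vs_baseline` are hypothetical and not part of the leaderboard.

```python
# Rows copied from the results above: (method, simple_acc, paired_acc, yes_rate).
# The two LLaVA-1.5 13B entries differ only in a truncated "Intervention Strategy=..."
# setting, so they keep the same name here.
ROWS = [
    ("LLaVA-1.5 13B", 59.88, 24.63, 74.38),
    ("LLaVA-1.5 13B", 58.56, 21.5, 79.31),
    ("Q4 system redistr (prop)", 57.88, 19.5, 75.38),
    ("Q4 text redistr (prop)", 55.0, 12.0, 92.0),
    ("Image×2.0", 53.69, 8.85, 94.56),
    ("Q4 system abl", 53.56, 8.13, 94.81),
    ("PAI", 53.5, 7.63, 95.25),
    ("No intervention baseline", 53.38, 7.38, 95.38),
    ("AD-HH", 53.06, 6.63, 95.94),
]

# Assumption: this row is the reference point for comparisons.
BASELINE = "No intervention baseline"


def deltas_vs_baseline(rows, baseline_name=BASELINE):
    """Return (method, d_simple, d_paired, d_yes_rate) for every non-baseline row.

    Each delta is the row's value minus the baseline's value, rounded to 2 decimals.
    A list is returned (not a dict) because method names may repeat.
    """
    base = next(r for r in rows if r[0] == baseline_name)
    out = []
    for name, *metrics in rows:
        if name == baseline_name:
            continue
        diffs = tuple(round(m - b, 2) for m, b in zip(metrics, base[1:]))
        out.append((name, *diffs))
    return out


if __name__ == "__main__":
    for name, d_simple, d_paired, d_yes in deltas_vs_baseline(ROWS):
        print(f"{name:26s} {d_simple:+7.2f} {d_paired:+7.2f} {d_yes:+7.2f}")
```

Returning a list rather than a dictionary keeps both LLaVA-1.5 13B entries, which would otherwise collide on the same key.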