Vision-Language Compositional Reasoning on ARO
[Chart: Accuracy over time. Best result: CE-CLIP+, Accuracy 0.804 (Mar 3, 2025)]
Updated 4d ago
Evaluation Results
Method        Details                      Links     Accuracy
CE-CLIP+      Training Source=COCO+C...    2025.03   0.804
MosaiCLIP     Training Source=COCO,...     2025.03   0.803
CE-CLIP       Training Source=COCO,...     2025.03   0.797
AMR-NegCLIP   Training Source=COCO,...     2025.03   0.794
SPARCL        Training Source=COCO,...     2025.03   0.772
NegCLIP       Training Source=COCO,...     2025.03   0.76
CLOVE         Training Source=LAION-...    2025.03   0.732
SPEC          Training Source=LAION,...    2025.03   0.701
syn-CLIP      Training Source=SyViC,...    2025.03   0.692
FIGCLIP       Training Source=VidSit...    2025.03   0.67
[79]          Training Source=COCO,...     2025.03   0.65
CLIP          Training Source=COCO,...     2025.03   0.641
CLIP          Num Real Images=0, Num...    2025.03   0.611
SDS-CLIP      Training Source=COCO,...     2025.03   0.575