Vision-Language Compositional Evaluation on CC3M 50,000 random subset TripletData
[Chart: Text Score, Image Score, and Group Score over time. Highlighted point: TripletCLIP, Text Score 92.25, Nov 4, 2024.]
Updated 4d ago
Evaluation Results
Method        Pre-training Data   Date      Text Score   Image Score   Group Score
TripletCLIP   CC12M               2024.11   92.25        66.82         64.3
NegCLIP       CC12M               2024.11   54.84        30.42         25.82
CLIP          CC12M               2024.11   52.69        29.66         24.64
NegCLIP++     CC12M               2024.11   36.5         30.67         20.11