VL-Checklist

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Compositional Reasoning | VL-Checklist | Attribute Score 81.8 | 37 |
| Image-Text Matching | VL-Checklist | AURC 0.258 | 23 |
| Vision-Language Probing | VL-CheckList (test) | Object: Avg 86.9 | 17 |
| Hard-negative classification | VL-CheckList Object | Score (Center) 71.2 | 6 |
| Vision-Language Alignment | VL-Checklist VG-spatial | Accuracy 63.5 | 4 |