SugarCrepe

Benchmarks

| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Compositional Reasoning | SugarCrepe | Overall Accuracy | 87.5 | 43 |
| Language Compositionality | SugarCrepe (test) | Replace: Object (R@1) | 100 | 21 |
| Vision-Language Compositionality | SugarCrepe | Accuracy | 88.06 | 20 |
| Compositional Evaluation | SugarCrepe (test) | Replace (Object) | 95.52 | 20 |
| Image-Text Matching | SugarCrepe | AURC | 16.7 | 17 |
| Image-Text Compositionality Evaluation | SugarCrepe ++ (test) | Swap Object ITT | 100 | 17 |
| Compositional Evaluation | SugarCrepe swap att (test) | Accuracy | 82.1 | 13 |
| Visual Question Answering | SugarCrepe | Simple Accuracy | 82.14 | 9 |
| Compositional Image-Text Matching | SugarCrepe | Replacement Score | 88.7 | 9 |
| Compositional Reasoning | SugarCrepe 1.0 (test) | Replace Acc (Object) | 100 | 8 |
| Language Compositionality | SugarCrepe 1.0 (test) | Recall@1 (Replace, Object) | 88.1 | 8 |
| Image-to-text retrieval | SugarCrepe | R@1 (Add) | 73.8 | 8 |
| Vision-Language Reasoning | SugarCrepe (test) | Simple Accuracy | 62.75 | 7 |
| Image-Caption Alignment | SugarCrepe (test) | Replace Object | 96.9 | 7 |
| Hallucination Reasoning | SugarCrepe | Accuracy | 86.4 | 5 |
| Vision-Language Compositional Reasoning | SugarCrepe++ | Accuracy | 66.1 | 5 |
| Image-Text Matching | SugarCREPE sampled balanced | Accuracy | 58.7 | 3 |
| Paired-prompt evaluation | SugarCrepe | Simple Accuracy | 64.56 | 2 |