
Winoground

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Compositional Vision-Language Reasoning | Winoground | Text Score 89.5 | 61 |
| Compositional Scene Understanding | Winoground | Text Alignment Score 64 | 44 |
| Compositional Reasoning | Winoground | Group Score 41.25 | 30 |
| Image-Text Matching | Winoground | Text Agreement Score 89.5 | 26 |
| Vision-Language Compositional Reasoning | Winoground 1.0 (test) | Text Score 89.5 | 23 |
| Compositional Evaluation | Winoground (test) | Text Score 74 | 15 |
| Visual Question Answering | WinogroundVQA v1.0 (test) | Accuracy 46.5 | 14 |
| Fine-grained retrieval | Winoground (test) | Text Agreement (%) 40 | 12 |
| Image-text alignment | Winoground (test) | Text Score 89.5 | 12 |
| Fine-grained Image-Text Matching | Winoground | Group Agreement 25.8 | 11 |
| Image-Text Retrieval | Winoground (test) | Text Score 74 | 10 |
| Vision-Language Reasoning | Winoground | Simple Acc 59.88 | 9 |
| Text-to-image retrieval | Winoground | R@1 (T2I) 0.133 | 8 |
| Vision-Language Compositional Reasoning | Winoground standard (test) | Text Score 75.5 | 7 |
| Text Selection | Winoground | Text Score 34 | 7 |
| Image Selection | Winoground | Image Score 14 | 7 |
| Vision-Language Compositional Reasoning | Winoground (test) | Text Score 61.3 | 7 |
| Vision-Language Understanding | Winoground | Text Accuracy 61.5 | 5 |
| Image-Text Matching | Winoground 1.0 (full) | Text Agreement Score 89.5 | 5 |
| Vision-Language Reasoning | Winoground | Text Score 30.5 | 4 |
| Compositional Evaluation | Winoground Txt2Img | Txt2Img Score 14 | 4 |
| Image-Text Matching | Winoground clean | Text Agreement Score 52.63 | 4 |
| Vision-Language Alignment | Winoground | Accuracy 63.38 | 3 |
| Image-Text Matching | Winoground (full) | Accuracy 52.7 | 3 |
| Compositional Reasoning | Winoground (test) | Image Accuracy 27 | 3 |
Showing 25 of 28 rows