NLVR2

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Natural Language Visual Reasoning | NLVR2 (test-p) | Accuracy 92.6 | 346 |
| Natural Language Visual Reasoning | NLVR2 (dev) | Accuracy 91.51 | 307 |
| Visual Reasoning | NLVR2 | Accuracy 92.6 | 49 |
| Visual Reasoning | NLVR2 (test) | Accuracy 85.15 | 46 |
| Adversarial Attack | NLVR2 | Attack Success Rate 67.51 | 37 |
| Natural Language Visual Reasoning | NLVR2 | Accuracy 87.3 | 21 |
| Visual Reasoning | NLVR2 (test-P) | Accuracy 92.6 | 21 |
| Visual reasoning | NLVR2 (val) | Accuracy 91.1 | 20 |
| Visual Reasoning | NLVR2 v2 (dev) | Accuracy 88.7 | 20 |
| Visual Reasoning | NLVR2 (dev) | Accuracy 82.5 | 18 |
| Natural Language Visual Reasoning | NLVR2 (test) | Accuracy 85.36 | 16 |
| Natural Language Visual Reasoning | NLVR2 std | Accuracy 85.5 | 14 |
| Visual Reasoning | NLVR2 (test-dev) | Accuracy 79.87 | 14 |
| Multi-image reasoning and general capability evaluation | NLVR2 | Accuracy 90.42 | 12 |
| Natural Language Visual Reasoning | NLVR2 (val) | Accuracy 83.15 | 12 |
| Multi-image Understanding | NLVR2 (test) | Accuracy 87.3 | 9 |
| Visual Reasoning | NLVR2 loc (val) | Accuracy 77.27 | 5 |
| Visual Reasoning | NLVR2 | GFLOPs 20.01 | 4 |
| Natural Language Visual Reasoning | NLVR2 | GFLOPs 17.4 | 4 |
| Natural Language Visual Reasoning | NLVR2 (test-u) | Accuracy 67.3 | 2 |