| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Visual Entailment | SNLI-VE (test) | Overall Accuracy | 91.2 | 197 |
| Visual Entailment | SNLI-VE (val) | Overall Accuracy | 85 | 109 |
| Visual Entailment | SNLI-VE (dev) | Accuracy | 91 | 70 |
| Visual Entailment | SNLI-VE | Accuracy | 0.842 | 24 |
| Visual Entailment | SNLI-VE (test-p) | Accuracy | 80.2 | 24 |
| Multimodal Classification | SNLI-VE (test) | Accuracy | 81.64 | 22 |
| Visual Entailment | SNLI-VE std | Accuracy | 82.3 | 8 |
| Multimodal Classification | SNLI-VE | Accuracy | 89.2 | 6 |
| Visual Entailment | SNLI-VE | GFLOPs | 7.7 | 4 |