| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Natural Language Visual Reasoning | NLVR2 (test-p) | Accuracy | 92.6 | 327 |
| Natural Language Visual Reasoning | NLVR2 (dev) | Accuracy | 91.51 | 288 |
| Visual Reasoning | NLVR2 | Accuracy | 92.6 | 49 |
| Visual Reasoning | NLVR2 (test) | Accuracy | 85.15 | 44 |
| Adversarial Attack | NLVR2 | Attack Success Rate | 67.51 | 37 |
| Visual Reasoning | NLVR2 (test-P) | Accuracy | 92.6 | 21 |
| Visual Reasoning | NLVR2 (val) | Accuracy | 91.1 | 20 |
| Visual Reasoning | NLVR2 v2 (dev) | Accuracy | 88.7 | 20 |
| Visual Reasoning | NLVR2 (dev) | Accuracy | 82.5 | 16 |
| Natural Language Visual Reasoning | NLVR2 (test) | Accuracy | 85.36 | 16 |
| Natural Language Visual Reasoning | NLVR2 | Accuracy | 87.3 | 15 |
| Natural Language Visual Reasoning | NLVR2 std | Accuracy | 85.5 | 14 |
| Visual Reasoning | NLVR2 (test-dev) | Accuracy | 79.87 | 14 |
| Natural Language Visual Reasoning | NLVR2 (val) | Accuracy | 83.15 | 12 |
| Multi-image Understanding | NLVR2 (test) | Accuracy | 87.3 | 9 |
| Visual Reasoning | NLVR2 loc (val) | Accuracy | 77.27 | 5 |
| Natural Language Visual Reasoning | NLVR2 | GFLOPs | 17.4 | 4 |
| Natural Language Visual Reasoning | NLVR2 (test-u) | Accuracy | 67.3 | 2 |