| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| General Robust Image Task (GRIT) multi-task evaluation | GRIT ablation set (same) | Categorization Accuracy: 85 | 38 |
| Referring Expression Comprehension | GRIT refexp | Accuracy: 78.61 | 15 |
| Multi-task Vision and Language Evaluation | GRIT (test) | Overall Score: 67 | 14 |
| Multi-task Vision and Language Evaluation | GRIT (General Robustness and Information Transfer) unrestricted track (test) | Captioning Acc: 55.1 | 2 |