| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Visual Question Answering | SIXray and PIDray combined (test) | Instance Location (IL): 49.22 | 10 |
| Visual Grounding | SIXray and PIDray combined (test) | Acc@0.5: 12.5 | 7 |
| Scene Comprehension | SIXray and PIDray combined (test) | F1 Score: 34.7 | 6 |