| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Visual Reasoning | REASONMAP-PLUS | Weighted Accuracy: 88.95 | 16 |
| Visual Reasoning | REASONMAP Long questions | Weighted Accuracy: 62.5 | 16 |
| Visual Reasoning | REASONMAP Short questions | Weighted Accuracy: 0.5998 | 16 |
| High-level Planning | ReasonMap L (long questions) | Weighted Accuracy: 0.0747 | 3 |
| High-level Planning | ReasonMap S (short questions) | Weighted Accuracy: 15.44 | 3 |