| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Vision-Language Navigation | R2R-CE (val-unseen) | Success Rate (SR) | 68 | 433 |
| Vision-and-Language Navigation | R2R (val unseen) | Success Rate (SR) | 82.46 | 344 |
| Vision-Language Navigation | R2R Unseen (test) | SR | 86 | 134 |
| Vision-Language Navigation | R2R (test unseen) | SR | 86 | 122 |
| Vision-Language Navigation | R2R (val seen) | Success Rate (SR) | 7,540 | 120 |
| Vision-and-Language Navigation | R2R (val seen) | Success Rate (SR) | 83.74 | 68 |
| Vision-and-Language Navigation | R2R-CE (test-unseen) | SR | 66 | 63 |
| Vision-and-Language Navigation | R2R (test) | SPL (Success weighted Path Length) | 76 | 49 |
| Vision-Language Navigation | R2R unseen v1.0 (val) | SR | 3,130 | 37 |
| Embodied Navigation | R2R-CE | Navigation Error (NE) | 4.73 | 19 |
| Vision-Language Navigation | R2R VLN-PE (val unseen) | Navigation Error (NE) | 4.33 | 18 |
| Vision-Language Navigation | R2R 1 (test unseen) | Success Rate | 0.76 | 18 |
| Vision-Language Navigation | R2R VLN-PE (val seen) | Navigation Error (NE) | 4.1 | 17 |
| Vision-Language Navigation | R2R VLN Challenge Leaderboard (test) | PL | 1,257.38 | 16 |
| Vision-and-Language Navigation | R2R (Val-U) | SPL | 66 | 13 |
| Vision-and-Language Navigation | R2R | Success Rate (SR) | 43.7 | 12 |
| Vision-and-Language Navigation | R2R Discrete (val-unseen) | Navigation Error (NE) | 2.09 | 12 |
| Instruction Following | R2R unseen (test) | Success Rate (SR) | 62.2 | 11 |
| Vision-Language Navigation | R2R 1 (val seen) | Navigation Error (NE) | 1.67 | 10 |
| Vision-Language Navigation | R2R Unseen House (val) | Navigation Error (NE) | 4.83 | 9 |
| Vision-and-Language Navigation | R2R generalization (unseen) | SR | 35.2 | 8 |
| Instruction Generation | R2R (test) | SR | 76 | 7 |
| Human Wayfinding | R2R (val-unseen) | WC | 24.5 | 6 |
| Vision-and-Language Navigation | R2R unseen complementary (val) | Path Length (PL) | 7.8 | 6 |
| Vision-Language Navigation | R2R | FPS | 4.68 | 5 |