| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|---|
| Depth Estimation | Environments (unseen) | DPE 1.25 | 7 |
| Sound Source Localization | Environments (unseen) | SLE 17 | 7 |
| Depth Estimation | Environments (seen) | DPE 164 | 6 |
| Sound Source Localization | Environments (seen) | SLE 14.9 | 6 |
| Multi-step rollout prediction | 7 Environments (AlfWorld, BabyAI, Maze, SciWorld, TextCraft, WebShop, Wordle) (held-out episodes) | Token F1 (t=1) 69 | 5 |
| Neural Forward Frame Rendering | Six Training Environments (val) | PSNR 20.96 | 2 |