AR Forcing: Towards Long-Horizon Robot Navigation World Model
About
Diffusion-based world models for robot navigation are typically trained with parallel supervision, yet path planning runs them autoregressively at inference time. The resulting distribution shift between training and inference destabilizes long-horizon prediction. We propose AR Forcing, an autoregressive training strategy that integrates the standard diffusion loss into the autoregressive training loop. At each step, the model updates its context with its own predictions and optimizes the single-step noise-prediction objective, which explicitly exposes it to the inference-time state distribution during training. Our method requires no additional discriminators or distribution-matching losses, retains the original diffusion framework and sampler, and is easy to integrate. Experiments on multi-domain navigation datasets (RECON, SCAND, HuRoN, TartanDrive) show that, compared with strong baselines, AR Forcing improves both the consistency of generated images over long-horizon navigation and the accuracy of predicted trajectories, making the model more robust in complex known and unknown environments. We will release the code soon.
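The training loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation (the code is not yet released); it only shows the general shape of the idea under several loudly stated assumptions: frames are scalars in [-1, 1], the noise predictor is a hypothetical linear model with weights `w`, the noise schedule is a uniform draw of the signal level, the self-generated frame is a clamped one-step estimate rather than the output of a full sampler, and the context is treated as a constant when computing the gradient. The function name `ar_forcing_rollout` and all of its parameters are made up for illustration.

```python
import math
import random


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def ar_forcing_rollout(w, frames, actions, context_len=2, lr=0.01, rng=None):
    """One autoregressive training pass over a single toy trajectory.

    w       : weights of a hypothetical linear noise predictor (updated in place)
    frames  : ground-truth scalar "frames" x_0..x_T, each in [-1, 1]
    actions : scalar actions a_0..a_{T-1}; a_{t-1} leads from x_{t-1} to x_t
    Returns (per-step losses, self-predicted frames).
    """
    rng = rng or random.Random(0)
    context = list(frames[:context_len])  # seed the context with real frames
    losses, predictions = [], []
    for t in range(context_len, len(frames)):
        target = frames[t]

        # Standard diffusion forward process applied to the ground-truth next frame.
        alpha = rng.uniform(0.3, 0.95)          # signal level (assumed schedule)
        sigma = math.sqrt(1.0 - alpha * alpha)  # matching noise level
        eps = rng.gauss(0.0, 1.0)
        noisy = alpha * target + sigma * eps

        # Condition on the context, which after the first steps holds the
        # model's own predictions rather than ground-truth frames.
        feats = [noisy] + context + [actions[t - 1], sigma, 1.0]

        # Single-step noise-prediction objective: squared error on eps.
        err = dot(w, feats) - eps
        losses.append(err * err)

        # Plain SGD step; the context is a constant here, so no gradient
        # flows through earlier predictions (an assumption of this sketch).
        for i, f in enumerate(feats):
            w[i] -= lr * 2.0 * err * f

        # Stand-in for running the sampler: one-step clean-frame estimate,
        # clamped to the valid frame range.
        x_pred = (noisy - sigma * dot(w, feats)) / alpha
        x_pred = max(-1.0, min(1.0, x_pred))
        predictions.append(x_pred)

        # Update the context with the model's own prediction.
        context = context[1:] + [x_pred]
    return losses, predictions


# Toy usage: 12 scalar frames, 11 actions, 6 weights
# (1 noisy frame + 2 context frames + action + noise level + bias).
frames = [math.sin(0.3 * t) for t in range(12)]
actions = [0.3] * 11
w = [0.0] * 6
losses, preds = ar_forcing_rollout(w, frames, actions)
```

The contrast with parallel supervision is the last line of the loop: a teacher-forced variant would append `frames[t]` to the context instead of `x_pred`, so the model would never see its own errors during training.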
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Long-horizon prediction | RECON | LPIPS | 0.261 | 10 |
| Long-horizon prediction | TartanDrive | LPIPS | 0.334 | 10 |
| Long-horizon prediction | SCAND | LPIPS | 0.396 | 10 |
| Long-horizon prediction | HuRoN | LPIPS | 0.27 | 10 |
| Goal-Conditioned Visual Navigation | Goal-Conditioned Visual Navigation (2s horizon) | ATE | 1.22 | 6 |
| Goal-Conditioned Visual Navigation | RECON (4s horizon) | ATE | 1.69 | 2 |
| Goal-Conditioned Visual Navigation | RECON (8s horizon) | ATE | 5.8 | 2 |
| Goal-Conditioned Visual Navigation | HuRoN (4s horizon) | ATE | 9.23 | 2 |
| Goal-Conditioned Visual Navigation | HuRoN (8s horizon) | ATE | 28.76 | 2 |
| Goal-Conditioned Visual Navigation | HuRoN (16s horizon) | ATE | 56.7 | 2 |