
StepNav: Structured Trajectory Priors for Efficient and Multimodal Visual Navigation

About

Visual navigation is fundamental to autonomous systems, yet generating reliable trajectories in cluttered and uncertain environments remains a core challenge. Recent generative models promise end-to-end synthesis, but their reliance on unstructured noise priors often yields unsafe, inefficient, or unimodal plans that cannot meet real-time requirements. We propose StepNav, a novel framework that bridges this gap by introducing structured, multimodal trajectory priors derived from variational principles. StepNav first learns a geometry-aware success probability field to identify all feasible navigation corridors. These corridors are then used to construct an explicit multimodal mixture prior that initializes a conditional flow-matching process. The flow-matching refinement is formulated as an optimal control problem with explicit smoothness and safety regularization. By replacing unstructured noise with physically grounded candidates, StepNav generates safer and more efficient plans in significantly fewer steps. Experiments on both simulated and real-world benchmarks demonstrate consistent improvements in robustness, efficiency, and safety over state-of-the-art generative planners, advancing reliable trajectory generation for practical autonomous navigation. The code has been released at https://github.com/LuoXubo/StepNav.
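The idea of starting a flow from a structured mixture prior instead of unstructured noise can be illustrated with a toy sketch. This is not the released StepNav code: the two hand-written corridor centerlines, the mixture weights, the noise scale `sigma`, and the `toy_velocity` smoothing field are all hypothetical stand-ins for the learned success probability field and the learned flow-matching velocity network, and the function names are made up for illustration.

```python
import numpy as np

def sample_mixture_prior(centerlines, weights, sigma, rng):
    """Draw one trajectory from a Gaussian mixture whose components are
    centered on candidate corridor centerlines (array of shape K x T x 2)."""
    weights = np.asarray(weights, dtype=float)
    k = rng.choice(len(centerlines), p=weights / weights.sum())
    noise = rng.normal(scale=sigma, size=centerlines[k].shape)
    return int(k), centerlines[k] + noise

def euler_refine(x0, velocity_fn, num_steps):
    """Integrate dx/dt = v(x, t) from t = 0 to t = 1 with fixed-step Euler."""
    x = x0.copy()
    dt = 1.0 / num_steps
    for i in range(num_steps):
        x = x + dt * velocity_fn(x, i * dt)
    return x

def toy_velocity(x, t):
    """Hypothetical stand-in for a learned velocity network: moves each
    interior waypoint toward the midpoint of its neighbours (smoothing)
    and leaves the start and goal waypoints fixed."""
    v = np.zeros_like(x)
    v[1:-1] = 0.5 * (x[:-2] + x[2:]) - x[1:-1]
    return v

rng = np.random.default_rng(0)
T = 16                                  # waypoints per trajectory (assumed)
s = np.linspace(0.0, 1.0, T)
# Two assumed corridors: pass an obstacle on the left or on the right.
left = np.stack([s, 0.3 * np.sin(np.pi * s)], axis=1)
right = np.stack([s, -0.3 * np.sin(np.pi * s)], axis=1)
centerlines = np.stack([left, right])

k, x0 = sample_mixture_prior(centerlines, [0.6, 0.4], sigma=0.02, rng=rng)
traj = euler_refine(x0, toy_velocity, num_steps=4)
```

Because the initial sample already lies near a feasible corridor, a handful of Euler steps is enough in this toy setting; a pure-noise start would need the flow to do all of the work.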

Xubo Luo, Aodi Wu, Haodong Han, Xue Wan, Wei Zhang, Leizheng Shu, Ruisuo Wang • 2026

Related benchmarks

Task | Dataset | Metric | Result | Rank
Point-Goal navigation | Stanford 2D-3D-S Indoor Basic Task | SR | 0.95 | 5
Point-Goal navigation | Stanford 2D-3D-S Indoor (Adaptation Task) | Success Rate | 90 | 5
Point-Goal navigation | Citysim Outdoor Basic Task | SR (%) | 0.57 | 5
Point-Goal navigation | Citysim Outdoor Adaptation Task | SR | 68 | 5
