FeudalNav: A Simple Framework for Visual Navigation

About

Visual navigation for robotics is inspired by the human ability to navigate environments using visual cues and memory, eliminating the need for detailed maps. In unseen, unmapped, or GPS-denied settings, traditional metric map-based methods fall short, prompting a shift toward learning-based approaches with minimal exploration. In this work, we develop a hierarchical framework that decomposes the navigation decision-making process into multiple levels. Our method learns to select subgoals through a simple, transferable waypoint selection network. A key component of the approach is a latent-space memory module organized solely by visual similarity, as a proxy for distance. This alternative to graph-based topological representations proves sufficient for navigation tasks, providing a compact, lightweight, simple-to-train navigator that can find its way to the goal in novel locations. We show results competitive with a suite of state-of-the-art (SOTA) methods in Habitat AI environments without using any odometry in training or inference. An additional contribution leverages the interpretability of the framework for interactive navigation. We consider the question: how much direction intervention/interaction is needed to achieve success in all trials? We demonstrate that even minimal human involvement can significantly enhance overall navigation performance.
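
The idea of a memory organized solely by visual similarity can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `LatentMemory`, the cosine-similarity test, and the `novelty_threshold` value are all assumptions made for illustration, and the embeddings are assumed to come from some learned image encoder that is not shown here.

```python
import numpy as np


class LatentMemory:
    """Hypothetical memory keyed only by visual similarity (no odometry, no graph)."""

    def __init__(self, novelty_threshold=0.9):
        # Assumed hyperparameter: below this cosine similarity, a view counts as new.
        self.novelty_threshold = novelty_threshold
        self.keys = []    # unit-normalized embeddings of remembered views
        self.visits = []  # how many observations matched each remembered view

    @staticmethod
    def _normalize(vector):
        vector = np.asarray(vector, dtype=float)
        norm = np.linalg.norm(vector)
        if norm == 0.0:
            raise ValueError("embedding must be non-zero")
        return vector / norm

    def nearest(self, embedding):
        """Return (index, cosine similarity) of the most similar stored view."""
        if not self.keys:
            return None, -1.0
        query = self._normalize(embedding)
        similarities = np.stack(self.keys) @ query
        index = int(np.argmax(similarities))
        return index, float(similarities[index])

    def update(self, embedding):
        """Store the view if it looks new, otherwise count a revisit."""
        index, similarity = self.nearest(embedding)
        if index is None or similarity < self.novelty_threshold:
            self.keys.append(self._normalize(embedding))
            self.visits.append(1)
            return len(self.keys) - 1
        self.visits[index] += 1
        return index
```

In this sketch, high similarity to a stored view is treated as "close to somewhere already visited", which is the sense in which similarity stands in for distance. A higher-level policy could, under the same assumptions, prefer waypoints whose views have low visit counts.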

Faith Johnson, Bryan Bo Cao, Shubham Jain, Ashwin Ashok, Kristin Dana • 2026

Related benchmarks

Task	Dataset	Result
Image-Goal Navigation	Gibson (test)	Succ (Average): 80.78	17
Image-Goal Navigation	Gibson Curved trajectories (unseen)	Succ (Easy): 72.5	12
Image-Goal Navigation	Gibson Straight trajectories (unseen)	Success Rate (Easy): 82.6	10
