VoroNav: Voronoi-based Zero-shot Object Navigation with Large Language Model

About

In household robotics, the Zero-Shot Object Navigation (ZSON) task requires agents to traverse unfamiliar environments and locate objects from novel categories without prior explicit training. This paper introduces VoroNav, a semantic exploration framework that uses a Reduced Voronoi Graph to extract exploratory paths and planning nodes from a semantic map constructed in real time. By harnessing topological and semantic information, VoroNav generates text-based descriptions of paths and images that are readily interpretable by a large language model (LLM). In particular, our approach combines path and farsight descriptions to represent the environmental context, enabling the LLM to apply commonsense reasoning to select waypoints for navigation. Extensive evaluation on HM3D and HSSD shows that VoroNav outperforms existing methods in both success rate and exploration efficiency (absolute improvement: +2.8% Success and +3.7% SPL on HM3D, +2.6% Success and +3.8% SPL on HSSD). Newly introduced metrics that evaluate obstacle avoidance proficiency and perceptual efficiency further corroborate the improvements achieved by our method in ZSON planning. Project page: https://voro-nav.github.io
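
The Success and SPL figures quoted in the abstract are standard embodied-navigation metrics. As a minimal sketch, assuming the commonly used definition of SPL (Success weighted by Path Length, where each successful episode contributes the ratio of the shortest-path length to the longer of the path actually taken and the shortest path), the two metrics could be computed as follows. The function and argument names are illustrative and are not taken from the VoroNav code.

```python
from typing import Sequence


def success_rate(successes: Sequence[bool]) -> float:
    """Fraction of episodes in which the agent reached the goal."""
    if not successes:
        raise ValueError("at least one episode is required")
    return sum(1 for s in successes if s) / len(successes)


def spl(
    successes: Sequence[bool],
    shortest_lengths: Sequence[float],
    taken_lengths: Sequence[float],
) -> float:
    """Success weighted by Path Length, averaged over all episodes.

    Assumed definition: mean over episodes of S_i * l_i / max(p_i, l_i),
    where S_i is the success flag, l_i the shortest-path length and
    p_i the length of the path the agent actually took.
    """
    if not (len(successes) == len(shortest_lengths) == len(taken_lengths)):
        raise ValueError("all inputs must have the same length")
    if not successes:
        raise ValueError("at least one episode is required")

    total = 0.0
    for ok, shortest, taken in zip(successes, shortest_lengths, taken_lengths):
        denom = max(taken, shortest)
        # Failed episodes contribute zero; degenerate zero-length
        # episodes are skipped to avoid dividing by zero.
        if ok and denom > 0:
            total += shortest / denom
    return total / len(successes)


if __name__ == "__main__":
    # Hypothetical episodes, not results from the paper.
    ok = [True, True, False]
    shortest = [5.0, 4.0, 3.0]
    taken = [10.0, 4.0, 6.0]
    print(f"Success: {success_rate(ok):.3f}, SPL: {spl(ok, shortest, taken):.3f}")
```

Under this definition SPL can never exceed the success rate, which is why the abstract reports it as a measure of exploration efficiency alongside Success.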

Pengying Wu, Yao Mu, Bingxian Wu, Yi Hou, Ji Ma, Shanghang Zhang, Chang Liu • 2024

Related benchmarks

Task	Dataset	Result
Object Goal Navigation	HM3D	Success Rate 42	96
Object Goal Navigation	HM3D v1 (val)	Success Rate (SR) 42	65
Object Navigation	HM3D v1	SR 42	49
Object Goal Navigation	HM3D (test)	SR 42	37
Object Goal Navigation	HM3D 0.1	SR 42	35
Object Goal Navigation	HM3D (val)	SR 42	21
Object Navigation	HM3D v0.1	Success Rate (SR) 42	18
Object Navigation	HM3D v1 (test)	Success Rate (SR) 42	17
Embodied Navigation	HSSD	Success Rate 41	15
Object Goal Navigation	HM3D (1000 episodes)	Success Rate (SR) 42	13

