Navigating to Objects Specified by Images

About

Images are a convenient way to specify which particular object instance an embodied agent should navigate to. Solving this task requires semantic visual reasoning and exploration of unknown environments. We present a system that can perform this task in both simulation and the real world. Our modular method solves sub-tasks of exploration, goal instance re-identification, goal localization, and local navigation. We re-identify the goal instance in egocentric vision using feature-matching and localize the goal instance by projecting matched features to a map. Each sub-task is solved using off-the-shelf components requiring zero fine-tuning. On the HM3D InstanceImageNav benchmark, this system outperforms a baseline end-to-end RL policy by 7x and a state-of-the-art ImageNav model by 2.3x (56% vs 25% success). We deploy this system to a mobile robot platform and demonstrate effective real-world performance, achieving an 88% success rate across a home and an office environment.
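
The goal-localization step described above (projecting matched features to a map) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a pinhole camera with a level optical axis, a metric depth image, a known agent pose, and a square top-down grid centered on the world origin. The function name `project_matches_to_map` and the parameters `cell_size` and `map_size` are hypothetical and chosen only for illustration.

```python
import numpy as np

def project_matches_to_map(pixels, depth, fx, cx, agent_xy, agent_yaw,
                           cell_size=0.05, map_size=480):
    """Mark the grid cells hit by matched keypoints (illustrative sketch).

    pixels:    (N, 2) integer array of (u, v) image coordinates of matches.
    depth:     (H, W) array of forward distances in meters (0 = invalid).
    fx, cx:    assumed pinhole focal length and principal point, in pixels.
    agent_xy:  agent position (x, y) in meters in the world frame.
    agent_yaw: agent heading in radians (0 = facing world +x).
    """
    u, v = pixels[:, 0], pixels[:, 1]
    z = depth[v, u]                       # forward distance of each match
    valid = z > 0                         # drop matches without depth
    u, z = u[valid], z[valid]

    # Back-project: lateral offset in the camera frame (positive = right).
    right = (u - cx) * z / fx
    forward, left = z, -right

    # Rotate and translate from the agent frame into the world frame.
    c, s = np.cos(agent_yaw), np.sin(agent_yaw)
    wx = agent_xy[0] + c * forward - s * left
    wy = agent_xy[1] + s * forward + c * left

    # Discretize into grid cells; the world origin sits at the map center.
    cols = np.floor(wx / cell_size).astype(int) + map_size // 2
    rows = np.floor(wy / cell_size).astype(int) + map_size // 2
    inside = (rows >= 0) & (rows < map_size) & (cols >= 0) & (cols < map_size)

    goal_map = np.zeros((map_size, map_size), dtype=bool)
    goal_map[rows[inside], cols[inside]] = True
    return goal_map
```

In a sketch like this, the resulting boolean map could be handed to any local planner as the navigation target, which fits the modular design the abstract describes; how the actual system filters matches or decides that the goal has been re-identified is not specified here.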

Jacob Krantz, Theophile Gervet, Karmesh Yadav, Austin Wang, Chris Paxton, Roozbeh Mottaghi, Dhruv Batra, Jitendra Malik, Stefan Lee, Devendra Singh Chaplot • 2023

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Instance Image-Goal Navigation | HM3D v3 (val) | Success Rate (SR): 56.1 | 15 |
| Instance Image-Goal Navigation | HM3D | SR: 56.1 | 8 |
| Image-Goal Navigation | HM3D Instance ImageNav (test) | SR: 56.1 | 8 |
| Image-Goal Navigation | HM3D challenge | Success Rate (SR): 56.1 | 7 |
