PONI: Potential Functions for ObjectGoal Navigation with Interaction-free Learning

About

State-of-the-art approaches to ObjectGoal navigation rely on reinforcement learning and typically require significant computational resources and time for learning. We propose Potential functions for ObjectGoal Navigation with Interaction-free learning (PONI), a modular approach that disentangles the skills of 'where to look?' for an object and 'how to navigate to (x, y)?'. Our key insight is that 'where to look?' can be treated purely as a perception problem, and learned without environment interactions. To address this, we propose a network that predicts two complementary potential functions conditioned on a semantic map and uses them to decide where to look for an unseen object. We train the potential function network using supervised learning on a passive dataset of top-down semantic maps, and integrate it into a modular framework to perform ObjectGoal navigation. Experiments on Gibson and Matterport3D demonstrate that our method achieves state-of-the-art results for ObjectGoal navigation while incurring up to 1,600x less computational cost for training. Code and pre-trained models are available: https://vision.cs.utexas.edu/projects/poni/

Santhosh Kumar Ramakrishnan, Devendra Singh Chaplot, Ziad Al-Halah, Jitendra Malik, Kristen Grauman • 2022
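
The abstract says the network predicts two complementary potential functions over a semantic map and uses them to decide where to look. The following is a minimal sketch of how such a decision could be made. It is not the authors' implementation. The names `area_potential`, `object_potential`, `frontier_mask`, and `alpha`, the restriction to frontier cells, and the weighted-sum rule are all assumptions made for illustration; the potentials here are hand-written toy values rather than network outputs.

```python
from typing import List, Optional, Tuple

Grid = List[List[float]]


def select_goal(
    area_potential: Grid,
    object_potential: Grid,
    frontier_mask: List[List[bool]],
    alpha: float = 0.5,
) -> Optional[Tuple[int, int]]:
    """Return the (row, col) candidate cell with the highest combined potential.

    Assumption: the two potentials are combined as a weighted sum,
    alpha * area + (1 - alpha) * object, and only cells marked True in
    frontier_mask are considered. Returns None if no cell is marked.
    """
    best_cell: Optional[Tuple[int, int]] = None
    best_score = float("-inf")
    for r, row in enumerate(frontier_mask):
        for c, is_frontier in enumerate(row):
            if not is_frontier:
                continue
            score = (
                alpha * area_potential[r][c]
                + (1.0 - alpha) * object_potential[r][c]
            )
            if score > best_score:
                best_score = score
                best_cell = (r, c)
    return best_cell


# Toy 2x3 map with hypothetical potential values.
area = [[0.1, 0.9, 0.2], [0.0, 0.3, 0.4]]
obj = [[0.8, 0.1, 0.2], [0.0, 0.9, 0.9]]
frontier = [[True, True, False], [False, True, False]]

goal = select_goal(area, obj, frontier, alpha=0.5)
print(goal)
```

In this sketch, the selected cell would then be handed to a separate point-navigation module, which matches the abstract's split between 'where to look?' and 'how to navigate to (x, y)?'. Setting `alpha` closer to 1 favors the first potential and closer to 0 favors the second.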

Related benchmarks

Task	Dataset	Result
Object Goal Navigation	MP3D	SR 31.8	129
ObjectGoal Navigation	MP3D (val)	Success Rate 31.8	68
ObjectNav	Gibson (val)	Success Rate 73.6	22
Object Goal Navigation	MP3D (test)	SPL 0.121	18
Object Navigation	MP3D (val)	Success Rate (SR) 31.8	18
ObjectGoal Navigation	MP3D (test-std)	Success Rate 20.01	11
Object Goal Navigation	MP3D (1000 episodes)	Success Rate (SR) 31.8	8

