Visual Navigation with Spatial Attention

About

This work focuses on object goal visual navigation, which aims to find the location of an object from a given class, where at each step the agent is provided with an egocentric RGB image of the scene. We propose to learn the agent's policy using a reinforcement learning algorithm. Our key contribution is a novel attention probability model for visual navigation tasks. This attention encodes semantic information about observed objects, as well as spatial information about their location. This combination of the "what" and the "where" allows the agent to navigate toward the sought-after object effectively. The attention model is shown to improve the agent's policy and to achieve state-of-the-art results on commonly used datasets.
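
To make the "what" plus "where" idea concrete, here is a minimal sketch of an attention probability map over a grid of image cells. It is not the authors' model: the additive combination of scores, the function name attention_probabilities, and the example numbers are all assumptions made for illustration. It only shows how a per-cell semantic score and a per-cell spatial score can be turned into a single probability distribution with a softmax.

```python
import math

def attention_probabilities(semantic_scores, spatial_scores):
    """Softmax over all grid cells of (semantic + spatial) logits.

    Both arguments are 2D lists of equal shape. The additive combination
    is an illustrative assumption, not the paper's formulation.
    """
    cols = len(semantic_scores[0])
    # Flatten the grid and add the two scores cell by cell.
    logits = [
        s + p
        for s_row, p_row in zip(semantic_scores, spatial_scores)
        for s, p in zip(s_row, p_row)
    ]
    # Subtract the maximum logit for numerical stability.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    flat = [e / total for e in exps]
    # Restore the original grid shape.
    return [flat[i:i + cols] for i in range(0, len(flat), cols)]

# Hypothetical 2x2 grid: "what" (similarity to the target class) and
# "where" (a made-up positional preference).
semantic = [[0.1, 2.0], [0.0, 0.3]]
spatial = [[0.0, 0.5], [0.2, 0.0]]
attention = attention_probabilities(semantic, spatial)
```

In this toy grid the top-right cell has the highest combined score, so it receives the largest share of the attention mass, and the values over all cells sum to one.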

Bar Mayo, Tamir Hazan, Ayellet Tal • 2021

Related benchmarks

Task	Dataset	Metric	Value	(unlabeled)
Object Goal Navigation	iTHOR v1 (ALL)	Success Rate (SR)	46.2	14
Object Goal Navigation	iTHOR v1 (L >= 5)	Success Rate (SR)	32.63	14
Object Goal Navigation	RoboTHOR v1 (all)	Success Rate (SR)	13.57	12
Object Goal Navigation	RoboTHOR v1 (L >= 5)	Success Rate (SR)	5.14	12
Visual Navigation	AI2-THOR Unseen Scenes (L >= 5) (test)	SPL	0.1594	11
Visual Navigation	AI2-THOR (test)	SPL	17.88	4
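
For readers unfamiliar with the two metrics in the table, the sketch below computes Success Rate (SR) and Success weighted by Path Length (SPL) from a list of episodes. SPL follows the standard definition: the mean over episodes of success times shortest path length divided by the larger of the shortest and the taken path length. The episode records, field names, and the min_shortest filter are hypothetical. Reading "L >= 5" as "episodes whose shortest path has length at least 5" is an assumption based on common usage in this literature.

```python
def _select(episodes, min_shortest):
    """Keep episodes whose shortest path length is at least min_shortest."""
    return [e for e in episodes if e["shortest"] >= min_shortest]

def success_rate(episodes, min_shortest=0):
    """Fraction of selected episodes that ended in success."""
    selected = _select(episodes, min_shortest)
    return sum(1 for e in selected if e["success"]) / len(selected)

def spl(episodes, min_shortest=0):
    """Success weighted by Path Length over the selected episodes.

    Assumes every episode has a positive shortest path length.
    """
    selected = _select(episodes, min_shortest)
    total = 0.0
    for e in selected:
        if e["success"]:
            # Failed episodes contribute zero to the sum.
            total += e["shortest"] / max(e["taken"], e["shortest"])
    return total / len(selected)

# Hypothetical episodes, not data from the paper.
episodes = [
    {"success": True, "shortest": 4, "taken": 8},
    {"success": True, "shortest": 6, "taken": 6},
    {"success": False, "shortest": 10, "taken": 20},
    {"success": True, "shortest": 5, "taken": 10},
]

sr_all = success_rate(episodes)
spl_all = spl(episodes)
sr_long = success_rate(episodes, min_shortest=5)
spl_long = spl(episodes, min_shortest=5)
```

These functions return fractions between 0 and 1. The table mixes values that look like fractions (0.1594) with values that look like percentages (17.88), so multiply by 100 when comparing against the latter.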
