FOM-Nav: Frontier-Object Maps for Object Goal Navigation

About

This paper addresses the Object Goal Navigation problem, where a robot must efficiently find a target object in an unknown environment. Existing implicit memory-based methods struggle with long-term memory retention and planning, while explicit map-based approaches lack rich semantic information. To address these challenges, we propose FOM-Nav, a modular framework that enhances exploration efficiency through Frontier-Object Maps and vision-language models. Our Frontier-Object Maps are built online and jointly encode spatial frontiers and fine-grained object information. Using this representation, a vision-language model performs multimodal scene understanding and high-level goal prediction, and a low-level planner then executes the predicted goal through efficient trajectory generation. To train FOM-Nav, we automatically construct large-scale navigation datasets from real-world scanned environments. Extensive experiments validate the effectiveness of our model design and constructed dataset. FOM-Nav achieves state-of-the-art performance on the MP3D and HM3D benchmarks, particularly in the navigation efficiency metric SPL, and yields promising results on a real robot.
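The abstract highlights SPL (Success weighted by Path Length) as the efficiency metric. As a minimal sketch, assuming the commonly used definition SPL = (1/N) Σ S_i · l_i / max(p_i, l_i), where S_i is the success flag, l_i the shortest-path length to the goal, and p_i the length of the path the agent actually took, the metric could be computed as follows. The function and parameter names are hypothetical and are not taken from the FOM-Nav code.

```python
from typing import Sequence


def spl(
    successes: Sequence[bool],
    shortest_lengths: Sequence[float],
    path_lengths: Sequence[float],
) -> float:
    """Success weighted by Path Length, averaged over episodes.

    Assumes every shortest-path length is strictly positive.
    All names here are illustrative, not from the paper's code.
    """
    n = len(successes)
    if n == 0:
        raise ValueError("at least one episode is required")
    if not (n == len(shortest_lengths) == len(path_lengths)):
        raise ValueError("all inputs must have the same length")

    total = 0.0
    for success, shortest, taken in zip(successes, shortest_lengths, path_lengths):
        if shortest <= 0:
            raise ValueError("shortest-path lengths must be positive")
        if success:
            # A successful episode scores 1.0 only if the agent's path
            # is as short as the shortest path; longer paths score less.
            total += shortest / max(taken, shortest)
        # Failed episodes contribute 0 regardless of path length.
    return total / n
```

For example, with three episodes where the agent succeeds optimally, succeeds with a path twice the optimal length, and fails, `spl([True, True, False], [10.0, 10.0, 10.0], [10.0, 20.0, 5.0])` averages the per-episode scores 1.0, 0.5, and 0.0.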

Thomas Chabal, Shizhe Chen, Jean Ponce, Cordelia Schmid • 2025

Related benchmarks

Task	Dataset	Result
ObjectGoal Navigation	MP3D (val)	Success Rate 35	68
Object Goal Navigation	HM3D v1 (val)	Success Rate (SR) 57.3	65
Object Navigation	HM3D v2 (val)	SR 75.8	19
Object Navigation	OVON v1 (val)	SR (seen) 42.5	6
Object Navigation	HM3D v1sub (val)	Success Rate (SR) 0.73	5
Object Navigation	MP3D sub (val)	SR 44.6	4
