MerNav: A Highly Generalizable Memory-Execute-Review Framework for Zero-Shot Object Goal Navigation

About

Visual Language Navigation (VLN) is one of the fundamental capabilities for embodied intelligence and a critical challenge that urgently needs to be addressed. However, existing methods are still unsatisfactory in terms of both success rate (SR) and generalization: Supervised Fine-Tuning (SFT) approaches typically achieve higher SR, while Training-Free (TF) approaches often generalize better, but it is difficult to obtain both simultaneously. To this end, we propose a Memory-Execute-Review framework. It consists of three parts: a hierarchical memory module for providing information support, an execute module for routine decision-making and actions, and a review module for handling abnormal situations and correcting behavior. We validated the effectiveness of this framework on the Object Goal Navigation task. Across 4 datasets, our average SR achieved absolute improvements of 7% and 5% compared to all baseline methods under TF and Zero-Shot (ZS) settings, respectively. On the most commonly used HM3D_v0.1 and the more challenging open vocabulary dataset HM3D_OVON, the SR improved by 8% and 6%, under ZS settings. Furthermore, on the MP3D and HM3D_OVON datasets, our method not only outperformed all TF methods but also surpassed all SFT methods, achieving comprehensive leadership in both SR (5% and 2%) and generalization. Additionally, we deployed the MerNav model on the humanoid robot and conducted experiments in the real world. The project address is: https://qidekang.github.io/MerNav.github.io/

Dekang Qi, Shuang Zeng, Xinyuan Chang, Feng Xiong, Shichao Xie, Xiaolong Wu, Mu Xu• 2026

Related benchmarks

Task	Dataset	Result
Object Goal Navigation	MP3D	SR50.8	172
Object Goal Navigation	HM3D 0.1	SR68	35
Open-Vocabulary Object Goal Navigation	HM3D OVON (test)	SR45.7	17
Object Goal Navigation	HM3D OVON	SR45.7	14
Object Goal Navigation	HM3D 0.2	SR74.8	5
Open-Vocabulary Object Goal Navigation	MP3D (test)	Success Rate (SR)50.8	4

Showing 6 of 6 rows

Other info

Follow for update

@wizwand_team Discord