RePlan-Bot: Multi-Level Replanning for Embodied Instruction Following

About

Embodied instruction following (EIF) requires agents to understand and execute complex natural language commands within interactive 3D environments. Despite recent advances, existing methods often fail in long-horizon planning and handling irreversible state changes, resulting in low task success rates. To address these challenges, we introduce RePlan-Bot, a novel EIF agent that performs multi-level, continuous replanning throughout task execution. RePlan-Bot integrates a high-level LLM-based auditor for dynamic sub-goal adjustments guided by environmental feedback, a commonsense-guided search mechanism based on a multi-layered instance map for precise and structured object localization, and a lightweight ViT-based corrector to preemptively fix risky low-level actions. Evaluated on the ALFRED benchmark, RePlan-Bot achieves state-of-the-art performance in both seen and unseen environments, demonstrating superior adaptability and reliability.
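The three levels described above can be pictured as one control loop. The following is only a toy sketch under stated assumptions, not RePlan-Bot's implementation: every name in it (`Feedback`, `audit_subgoals`, `search_order`, `correct_action`, `run_episode`, `COMMONSENSE_PRIOR`) is hypothetical, the LLM auditor and ViT corrector are replaced by simple hand-written rules, and the multi-layered instance map is reduced to a plain dictionary of receptacles.

```python
from dataclasses import dataclass


@dataclass
class Feedback:
    """Hypothetical environment feedback for one executed action."""
    success: bool
    reason: str = ""


# Assumed commonsense prior: where an object is likely to be found.
COMMONSENSE_PRIOR = {
    "mug": ["countertop", "cabinet"],
    "apple": ["fridge", "countertop"],
}


def audit_subgoals(subgoals, done, feedback):
    """High level (rule-based stand-in for an LLM auditor).

    If the current sub-goal failed because its object was not found,
    insert a search step in front of it; otherwise keep the plan.
    """
    if not feedback.success and feedback.reason == "object_missing":
        target = subgoals[done][1]
        return subgoals[:done] + [("find", target)] + subgoals[done:]
    return subgoals


def search_order(target, instance_map):
    """Mid level: rank known receptacles, commonsense-preferred ones first."""
    preferred = COMMONSENSE_PRIOR.get(target, [])
    known = list(instance_map)
    return [r for r in preferred if r in known] + \
           [r for r in known if r not in preferred]


def correct_action(action, risky):
    """Low level (rule-based stand-in for a learned corrector)."""
    return "look_around" if action in risky else action


def run_episode(subgoals, instance_map, env_step, risky=frozenset(), max_steps=20):
    """Execute sub-goals in order, replanning after every failed action."""
    done, trace = 0, []
    for _ in range(max_steps):
        if done >= len(subgoals):
            break
        verb, target = subgoals[done]
        if verb == "find":
            order = search_order(target, instance_map)
            action = "goto:" + order[0] if order else "explore"
        else:
            action = verb + ":" + target
        action = correct_action(action, risky)
        feedback = env_step(action)
        trace.append(action)
        if feedback.success:
            done += 1
        else:
            subgoals = audit_subgoals(subgoals, done, feedback)
    return done >= len(subgoals), trace
```

`env_step` is assumed to be any callable that takes an action string and returns a `Feedback`. The point of the sketch is the ordering: the corrector runs before every action, while the auditor runs only after a failure is reported.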

Xicheng Gong, Guozheng Sun, Peiran Xu, Yadong Mu • 2026

Related benchmarks

Task	Dataset	Result	Rank
Embodied Task Completion	ALFRED seen (test)	Success Rate (SR): 52.05	26
Embodied Task Completion	ALFRED unseen (test)	Success Rate (SR): 47.61	26
