ReEXplore: Improving MLLMs for Embodied Exploration with Contextualized Retrospective Experience Replay

About

Embodied exploration is a target-driven process that requires embodied agents to combine fine-grained perception with knowledge-enhanced decision making. Recent work leverages multimodal large language models (MLLMs) for exploration because of their strong perceptual and reasoning abilities, but we find that MLLM-based embodied agents remain suboptimal when exploring new environments: (i) they rely on profound but stale pre-trained knowledge, (ii) training-based approaches such as imitation learning or reinforcement learning are expensive for long-horizon tasks with sparse outcome rewards, and (iii) frontier-based exploration yields a large, visually nuanced action space in which MLLMs struggle to make reliable decisions. We address these challenges with ReEXplore, a training-free framework that combines retrospective experience replay, which injects distilled, abstract experience at inference time, with hierarchical frontier selection, which decomposes frontier ranking into coarse-to-fine decisions. Our approach enables robust, traceable, and efficient exploration. Across multiple embodied exploration benchmarks, ReEXplore substantially improves over strong MLLM baselines, achieving up to 3x higher success rate and navigation efficiency with open-source backbones.
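As a rough illustration of what "coarse-to-fine" frontier selection could look like, the sketch below first groups candidate frontiers into angular sectors around the agent, picks one sector, and only then ranks the frontiers inside it. This is not the authors' implementation: the sector-based grouping, the function names `sector_of` and `select_frontier`, and the two scoring callbacks (stand-ins for whatever MLLM queries the real system makes) are all assumptions made for this example.

```python
import math
from collections import defaultdict
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical representation: a frontier is an (x, y) point on a 2D map.
Frontier = Tuple[float, float]


def sector_of(agent: Frontier, frontier: Frontier, num_sectors: int) -> int:
    """Coarse bucket: the angular sector around the agent containing the frontier."""
    angle = math.atan2(frontier[1] - agent[1], frontier[0] - agent[0]) % (2 * math.pi)
    # The trailing modulo guards against a rounding edge case at exactly 2*pi.
    return int(angle // (2 * math.pi / num_sectors)) % num_sectors


def select_frontier(
    agent: Frontier,
    frontiers: Sequence[Frontier],
    score_group: Callable[[List[Frontier]], float],
    score_frontier: Callable[[Frontier], float],
    num_sectors: int = 4,
) -> Frontier:
    """Two-stage choice: pick the best group first, then the best frontier in it.

    score_group and score_frontier are placeholders (assumptions) for the
    decisions an MLLM would make at the coarse and fine levels.
    """
    if not frontiers:
        raise ValueError("no frontiers to choose from")

    # Coarse stage: partition the candidates into a small number of groups.
    groups: Dict[int, List[Frontier]] = defaultdict(list)
    for f in frontiers:
        groups[sector_of(agent, f, num_sectors)].append(f)
    best_group = max(groups.values(), key=score_group)

    # Fine stage: rank only the candidates inside the chosen group.
    return max(best_group, key=score_frontier)


if __name__ == "__main__":
    candidates = [(1.0, 0.5), (2.0, 1.0), (-1.0, 0.5), (-3.0, -1.0)]
    # Toy scorers: prefer the most populated sector, then the nearest frontier.
    choice = select_frontier(
        (0.0, 0.0), candidates, score_group=len,
        score_frontier=lambda f: -math.hypot(f[0], f[1]),
    )
    print(choice)
```

The point of the decomposition is that each decision is made over a handful of options rather than the full frontier set, which is the difficulty the abstract attributes to large, visually nuanced action spaces. How ReEXplore actually forms and scores its groups is not specified on this page.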

Gengyuan Zhang, Mingcong Ding, Jingpei Wu, Ruotong Liao, Volker Tresp • 2025

Related benchmarks

Task	Dataset	Result
Multi-Modal Lifelong Navigation	GOAT-Bench unseen (val)	SR 59.8	37
Embodied Question Answering	A-EQA	Overall (LLM-Match) 58.3	33
Lifelong Visual Navigation	GOAT-Bench 1/10-scale subset (val-unseen)	Success Rate 59.8	13
