MAP: A Map-then-Act Paradigm for Long-Horizon Interactive Agent Reasoning

About

Current interactive LLM agents rely on goal-conditioned stepwise planning, where environmental understanding is acquired reactively during execution rather than established beforehand. This temporal inversion leads to Delayed Environmental Perception: agents must infer environmental constraints through trial and error, resulting in an Epistemic Bottleneck that traps them in inefficient failure cycles. Inspired by human affordance perception and cognitive map theory, we propose the Map-then-Act Paradigm (MAP), a plug-and-play framework that moves environment understanding ahead of execution. MAP consists of three stages: (1) Global Exploration, acquiring environment-general priors; (2) Task-Specific Mapping, constructing a structured cognitive map; and (3) Knowledge-Augmented Execution, solving tasks grounded in the map. Experiments show consistent gains across benchmarks and LLMs. On ARC-AGI-3, MAP enables frontier models to surpass near-zero baseline performance in 22 of 25 game environments. We further introduce MAP-2K, a dataset of map-then-act trajectories, and show that training on it outperforms training on expert execution traces, suggesting that understanding environments is more fundamental than imitation.
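
The three stages named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: `ToyEnv`, the function names, and the rule-based mapping step are all hypothetical stand-ins (the abstract does not specify how each stage is realized, and in the paper an LLM would play these roles). The sketch only shows the ordering the paradigm argues for: explore first, build a map, then act.

```python
from dataclasses import dataclass


@dataclass
class ToyEnv:
    """Hypothetical environment: the door opens only after the key is taken."""
    has_key: bool = False
    done: bool = False

    def actions(self):
        return ["open door", "take key"]

    def reset(self):
        self.has_key = False
        self.done = False

    def step(self, action):
        if action == "take key":
            self.has_key = True
            return "you hold the key"
        if action == "open door":
            if self.has_key:
                self.done = True
                return "the door opens"
            return "the door is locked"
        return "nothing happens"


def global_exploration(env):
    """Stage 1 (assumed form): try each action once, goal-free, and log the result."""
    env.reset()
    return {action: env.step(action) for action in env.actions()}


def task_specific_mapping(priors, goal):
    """Stage 2 (assumed form): distill the priors into a small map for one goal."""
    cmap = {"goal": goal, "preconditions": []}
    if "locked" in priors.get(goal, ""):
        # Toy rule, an assumption: actions whose observation mentions a key unlock the goal.
        cmap["preconditions"] = [a for a, obs in priors.items() if "key" in obs]
    return cmap


def knowledge_augmented_execution(env, cmap):
    """Stage 3 (assumed form): follow the map instead of discovering constraints by failing."""
    env.reset()
    trace = []
    for action in cmap["preconditions"] + [cmap["goal"]]:
        trace.append((action, env.step(action)))
    return trace


env = ToyEnv()
priors = global_exploration(env)
cmap = task_specific_mapping(priors, goal="open door")
trace = knowledge_augmented_execution(env, cmap)
```

In this toy case the locked door is discovered during exploration, so execution takes the key first and never issues a failing action. A purely stepwise agent would meet the same constraint only after its first failed attempt.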

Yuxin Liu, Ziang Ye, Yueqing Sun, Mingye Zhu, Jinwei Xiao, Zhuowen Han, Qi GU, Xunliang Cai, Lei Zhang • 2026

Related benchmarks

Task                         Dataset       Result                       Rank
Interactive Decision-making  AlfWorld      Overall Success Rate: 99.6   295
Interactive Decision-making  TextCraft     Success Rate: 99.6           42
Interactive Decision-making  ScienceWorld  Success Rate: 54.2           42
Exploration                  ARC-AGI 3     TU93 Level: 4                2