ReflAct: World-Grounded Decision Making in LLM Agents via Goal-State Reflection

About

Recent advances in LLM agents have largely built on reasoning backbones like ReAct, which interleave thought and action in complex environments. However, ReAct often produces ungrounded or incoherent reasoning steps, leading to misalignment between the agent's actual state and goal. Our analysis finds that this stems from ReAct's inability to maintain consistent internal beliefs and goal alignment, causing compounding errors and hallucinations. To address this, we introduce ReflAct, a novel backbone that shifts reasoning from merely planning next actions to continuously reflecting on the agent's state relative to its goal. By explicitly grounding decisions in states and enforcing ongoing goal alignment, ReflAct dramatically improves strategic reliability. This design delivers substantial empirical gains: ReflAct surpasses ReAct by 27.7% on average, achieving a 93.3% success rate in ALFWorld. Notably, ReflAct even outperforms ReAct with added enhancement modules (e.g., Reflexion, WKM), showing that strengthening the core reasoning backbone is key to reliable agent performance.
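The reflect-then-act idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the instruction wording in `REFLECT_INSTRUCTION`, the `ToyEnv` environment, the `stub_llm` policy, and every function name below are hypothetical stand-ins chosen for illustration. The only point it shows is that, at each step, the agent first states where it stands relative to the goal and only then picks an action.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical instruction wording; the paper's actual prompt is not reproduced here.
REFLECT_INSTRUCTION = (
    "Reflect on your current state in relation to the task goal, "
    "then choose the next action."
)


@dataclass
class Step:
    reflection: str   # the agent's statement of its state relative to the goal
    action: str       # the action chosen after reflecting
    observation: str  # what the environment returned for that action


def build_prompt(goal: str, history: List[Step], observation: str) -> str:
    """Assemble a prompt that keeps the goal and past reflections in view."""
    lines = [f"Goal: {goal}"]
    for step in history:
        lines.append(f"Reflection: {step.reflection}")
        lines.append(f"Action: {step.action}")
        lines.append(f"Observation: {step.observation}")
    lines.append(f"Current observation: {observation}")
    lines.append(REFLECT_INSTRUCTION)
    return "\n".join(lines)


class ToyEnv:
    """Hypothetical two-step text environment: take the mug, then place it."""

    def __init__(self) -> None:
        self.holding = False

    def reset(self) -> str:
        self.holding = False
        return "The mug is on the shelf."

    def step(self, action: str) -> Tuple[str, bool]:
        if action == "take mug" and not self.holding:
            self.holding = True
            return "You are holding the mug.", False
        if action == "put mug on desk" and self.holding:
            self.holding = False
            return "The mug is on the desk.", True
        return "Nothing happens.", False


def stub_llm(prompt: str) -> Tuple[str, str]:
    """Stand-in for a real model call; returns (reflection, action)."""
    current = [
        line for line in prompt.splitlines()
        if line.startswith("Current observation:")
    ][-1]
    if "holding the mug" in current:
        return ("I am holding the mug; the goal needs it on the desk.",
                "put mug on desk")
    return ("The mug is still on the shelf, so the goal is not yet met.",
            "take mug")


def run_episode(
    goal: str,
    env: ToyEnv,
    llm: Callable[[str], Tuple[str, str]],
    max_steps: int = 5,
) -> Tuple[bool, List[Step]]:
    """Run the reflect-then-act loop until the task is done or steps run out."""
    observation = env.reset()
    history: List[Step] = []
    for _ in range(max_steps):
        reflection, action = llm(build_prompt(goal, history, observation))
        observation, done = env.step(action)
        history.append(Step(reflection, action, observation))
        if done:
            return True, history
    return False, history


if __name__ == "__main__":
    success, trace = run_episode("put the mug on the desk", ToyEnv(), stub_llm)
    for step in trace:
        print(step.reflection, "->", step.action, "->", step.observation)
    print("success:", success)
```

In a real setup, `stub_llm` would be replaced by a call to a language model and `ToyEnv` by an environment such as ALFWorld; the loop structure is the only part this sketch is meant to convey.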

Jeonghye Kim, Sojeong Rhee, Minbeom Kim, Dohyung Kim, Sangmook Lee, Youngchul Sung, Kyomin Jung • 2025

Related benchmarks

Task	Dataset	Result
Interactive Decision-making	AlfWorld	Overall Success Rate 82.1	398
Embodied Task	AlfWorld	Overall Success Rate 47.1	183
Interactive web-based shopping tasks	Webshop	Success Rate 16	80
Web Shopping Agent	Webshop	Success Rate (SR) 31	72
Embodied Agent Task	ALFWorld Unseen	Success Rate 44.8	40
Agent Task Completion	τ-Bench Retail	Success Rate 61.4	31
OS Task	Lifelong Agent Bench OS Task	Success Rate (Last Epoch) 59.3	31
Embodied Agent Task	ScienceWorld Seen	Success Rate 55	18
Embodied Agent Task	ScienceWorld Unseen	Success Rate 50.9	18
Embodied Agent Task	ALFWorld Seen	Success Rate (%) 52.1	18

Showing 10 of 16 rows
