ReflAct: World-Grounded Decision Making in LLM Agents via Goal-State Reflection
About
Recent advances in LLM agents have largely built on reasoning backbones like ReAct, which interleave thought and action in complex environments. However, ReAct often produces ungrounded or incoherent reasoning steps, leading to misalignment between the agent's actual state and goal. Our analysis finds that this stems from ReAct's inability to maintain consistent internal beliefs and goal alignment, causing compounding errors and hallucinations. To address this, we introduce ReflAct, a novel backbone that shifts reasoning from merely planning next actions to continuously reflecting on the agent's state relative to its goal. By explicitly grounding decisions in states and enforcing ongoing goal alignment, ReflAct dramatically improves strategic reliability. This design delivers substantial empirical gains: ReflAct surpasses ReAct by 27.7% on average, achieving a 93.3% success rate in ALFWorld. Notably, ReflAct even outperforms ReAct with added enhancement modules (e.g., Reflexion, WKM), showing that strengthening the core reasoning backbone is key to reliable agent performance.
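The shift described above, from planning the next action to reflecting on the agent's state relative to its goal, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the instruction strings, the `Episode` container, the `step` helper, and the `llm` callable are all hypothetical names introduced here, and the paper's actual prompts are not reproduced.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Hypothetical instruction strings (assumptions, not the paper's prompts).
REACT_INSTRUCTION = "Think about what to do next, then give an action."
REFLACT_INSTRUCTION = (
    "Reflect on your current state in relation to the task goal, "
    "then give an action."
)


@dataclass
class Episode:
    """Goal plus a running log of (observation, reasoning, action) triples."""
    goal: str
    history: List[Tuple[str, str, str]] = field(default_factory=list)


def build_prompt(episode: Episode, observation: str, instruction: str) -> str:
    """Assemble a text prompt that restates the goal at every step."""
    lines = [f"Goal: {episode.goal}"]
    for obs, reasoning, action in episode.history:
        lines += [
            f"Observation: {obs}",
            f"Reasoning: {reasoning}",
            f"Action: {action}",
        ]
    lines += [f"Observation: {observation}", instruction, "Reasoning:"]
    return "\n".join(lines)


def parse_reply(reply: str) -> Tuple[str, str]:
    """Split a model reply into its reasoning text and its action."""
    reasoning, sep, action = reply.partition("Action:")
    if not sep:
        raise ValueError("reply contains no 'Action:' marker")
    return reasoning.strip(), action.strip()


def step(
    episode: Episode,
    observation: str,
    llm: Callable[[str], str],
    instruction: str = REFLACT_INSTRUCTION,
) -> str:
    """Run one reason-then-act step and record it in the episode history."""
    prompt = build_prompt(episode, observation, instruction)
    reasoning, action = parse_reply(llm(prompt))
    episode.history.append((observation, reasoning, action))
    return action
```

In this sketch the only difference between the two styles is the instruction passed to `step`: `REACT_INSTRUCTION` asks for a plan, while `REFLACT_INSTRUCTION` asks the model to state where it stands relative to the goal before acting. Any function that maps a prompt string to a reply string can serve as `llm`.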
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Interactive Decision-making | AlfWorld | Overall Success Rate | 82.1 | 295 |
| Embodied Task | AlfWorld | Overall Success Rate | 47.1 | 169 |
| Interactive web-based shopping tasks | Webshop | Score | 37.5 | 60 |
| Web Shopping Agent | Webshop | Score | 52 | 53 |
| Embodied Agent Task | ALFWorld Unseen | Success Rate | 44.8 | 40 |
| Embodied Agent Task | ScienceWorld Seen | Success Rate | 55 | 18 |
| Embodied Agent Task | ScienceWorld Unseen | Success Rate | 50.9 | 18 |
| Embodied Agent Task | ALFWorld Seen | Success Rate (%) | 52.1 | 18 |
| Embodied Agent Task | VirtualHome Unseen | Success Rate | 12.8 | 18 |
| Embodied Task Planning | VirtualHome (Seen) | Success Rate | 12.8 | 18 |