Reflecting with Two Voices: A Co-Adaptive Dual-Strategy Framework for LLM-Based Agent Decision Making
About
Large language model (LLM) agents often rely on external demonstrations or retrieval-augmented planning, leading to brittleness, poor generalization, and high computational overhead. Inspired by human problem-solving, we propose DuSAR (Dual-Strategy Agent with Reflecting), a demonstration-free framework that enables a single frozen LLM to perform co-adaptive reasoning via two complementary strategies: a high-level holistic plan and a context-grounded local policy. These strategies interact through a lightweight reflection mechanism in which the agent continuously assesses progress via a Strategy Fitness Score, revising its global plan when stuck and refining it upon meaningful advancement, mimicking human metacognitive behavior. On both simulated household (ALFWorld) and real-world web (Mind2Web) environments, DuSAR achieves state-of-the-art performance using only open-source LLMs, substantially outperforming all prior methods without any demonstrations or fine-tuning. It also reduces per-step token consumption by a large margin while maintaining strong task success. Ablation studies confirm the necessity of dual-strategy coordination. Moreover, optional integration of expert demonstrations further boosts performance, highlighting DuSAR's flexibility and compatibility with external knowledge.
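The reflection loop described in the abstract can be illustrated with a minimal sketch. This is not the DuSAR implementation: the function names, the prompt strings, the two thresholds (`stuck_below`, `advance_above`), and the way the fitness score is computed are all assumptions made for illustration, and the LLM, environment, and scorer are replaced by toy stand-ins so the control flow can be run end to end.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class AgentState:
    task: str
    plan: str = ""                                    # high-level holistic plan
    actions: List[str] = field(default_factory=list)  # local actions taken so far
    events: List[str] = field(default_factory=list)   # reflection decision per step


def run_episode(
    llm: Callable[[str], str],
    env_step: Callable[[str], Tuple[str, bool]],
    fitness: Callable[[str, List[str], str], float],
    task: str,
    max_steps: int = 10,
    stuck_below: float = 0.3,    # assumed threshold, not from the paper
    advance_above: float = 0.7,  # assumed threshold, not from the paper
) -> AgentState:
    """Hypothetical dual-strategy loop: one frozen LLM plays both roles."""
    state = AgentState(task=task)
    state.plan = llm(f"PLAN task={task}")  # global strategy
    observation = "start"
    for _ in range(max_steps):
        # Local policy: choose the next action from the plan and current context.
        action = llm(f"ACT plan={state.plan} obs={observation}")
        state.actions.append(action)
        observation, done = env_step(action)
        if done:
            state.events.append("done")
            break
        # Reflection: score progress, then revise, refine, or keep the plan.
        score = fitness(state.plan, state.actions, observation)
        if score < stuck_below:
            state.plan = llm(f"REVISE plan={state.plan} obs={observation}")
            state.events.append("revise")
        elif score > advance_above:
            state.plan = llm(f"REFINE plan={state.plan} obs={observation}")
            state.events.append("refine")
        else:
            state.events.append("keep")
    return state


# Toy stand-ins (hypothetical) so the sketch runs without a model or simulator.
def toy_llm(prompt: str) -> str:
    # Echo the leading keyword of the prompt in lowercase.
    return prompt.split()[0].lower()


def make_toy_env(steps_to_finish: int) -> Callable[[str], Tuple[str, bool]]:
    counter = {"n": 0}

    def env_step(action: str) -> Tuple[str, bool]:
        counter["n"] += 1
        return f"obs{counter['n']}", counter["n"] >= steps_to_finish

    return env_step


def toy_fitness(plan: str, actions: List[str], observation: str) -> float:
    # Low score after the first action, high after the second, neutral afterwards.
    return {1: 0.1, 2: 0.9}.get(len(actions), 0.5)


result = run_episode(toy_llm, make_toy_env(4), toy_fitness, "put the mug on the shelf")
print(result.events)
```

With these toy stand-ins, the agent revises the plan after the first step, refines it after the second, keeps it after the third, and finishes on the fourth. In a real setting, `llm` would wrap a single frozen model and `fitness` would be whatever scoring procedure the paper defines for its Strategy Fitness Score.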
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| GUI Navigation | Mind2Web (Cross-Website) | Element Accuracy | 44.6 | 23 |
| Web Action Generation Efficiency | Mind2Web Cross-Task | Time to Procedure | 378.2 | 16 |
| Web Action Generation Efficiency | Mind2Web (Cross-Website) | To_Pro Steps/Time | 364.1 | 16 |
| Web Action Generation Efficiency | Mind2Web Cross-Domain | To_Pro (Steps/Time) | 334.9 | 16 |
| Web Action Generation Efficiency | Mind2Web (All) | Time to Proposal Steps | 363.6 | 16 |
| Web Agent Navigation | MIND2WEB Cross-Task 1.0 | Element Accuracy | 54.9 | 16 |
| Web Agent Navigation | MIND2WEB Cross-Domain 1.0 | Element Accuracy | 47.4 | 16 |
| Web Agent Navigation | Mind2Web All 1.0 | Element Accuracy | 0.484 | 16 |
| Household simulation | ALFWorld (out-of-distribution) | Put Success Rate | 75 | 12 |