Reflecting with Two Voices: A Co-Adaptive Dual-Strategy Framework for LLM-Based Agent Decision Making

About

Large language model (LLM) agents often rely on external demonstrations or retrieval-augmented planning, leading to brittleness, poor generalization, and high computational overhead. Inspired by human problem-solving, we propose DuSAR (Dual-Strategy Agent with Reflecting) -- a demonstration-free framework that enables a single frozen LLM to perform co-adaptive reasoning via two complementary strategies: a high-level holistic plan and a context-grounded local policy. These strategies interact through a lightweight reflection mechanism, where the agent continuously assesses progress via a Strategy Fitness Score and dynamically revises its global plan when stuck or refines it upon meaningful advancement, mimicking human metacognitive behavior. On both simulated household (ALFWorld) and real-world web (Mind2Web) environments, DuSAR achieves state-of-the-art performance using only open-source LLMs, substantially outperforming all prior methods without any demonstrations or fine-tuning. Remarkably, it also reduces per-step token consumption by a large margin while maintaining strong task success. Ablation studies confirm the necessity of dual-strategy coordination. Moreover, optional integration of expert demonstrations further boosts performance, highlighting DuSAR's flexibility and compatibility with external knowledge.
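The reflection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the Strategy Fitness Score is computed or which thresholds trigger a change, so the `reflect` function, the `AgentState` class, and the `stuck_below` / `advance_above` thresholds below are all hypothetical. The `revise` and `refine` callables stand in for calls to the frozen LLM.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class AgentState:
    """Hypothetical container for the global plan and the action history."""
    plan: str
    history: List[str] = field(default_factory=list)


def reflect(
    state: AgentState,
    fitness: float,
    revise: Callable[[str, List[str]], str],
    refine: Callable[[str, List[str]], str],
    stuck_below: float = 0.3,    # assumed threshold, not from the paper
    advance_above: float = 0.7,  # assumed threshold, not from the paper
) -> str:
    """Update the global plan based on a fitness score in [0, 1].

    Returns the decision taken: "revise", "refine", or "keep".
    """
    if fitness < stuck_below:
        # Low score: treat the agent as stuck and rewrite the global plan.
        state.plan = revise(state.plan, state.history)
        return "revise"
    if fitness > advance_above:
        # High score: meaningful progress, so tighten the existing plan.
        state.plan = refine(state.plan, state.history)
        return "refine"
    # Otherwise leave the plan alone and let the local policy keep acting.
    return "keep"


if __name__ == "__main__":
    # Stub strategies standing in for LLM calls.
    state = AgentState(plan="find a mug", history=["go to cabinet 1"])
    decision = reflect(
        state,
        fitness=0.1,
        revise=lambda plan, hist: plan + " (revised)",
        refine=lambda plan, hist: plan + " (refined)",
    )
    print(decision, "->", state.plan)
```

In this sketch the local policy is left out entirely; only the plan-level decision is shown, since that is the part the abstract ties to the fitness score.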

Wentao Zhang, Qunbo Wang, BoXuan Zhao, Tao Zhang, Junsheng Wu, Hongping Gan, Ling Dai, Shizhuang Deng, Shuntong Sun, Yang Liu • 2025

Related benchmarks

Task	Dataset	Result
Web Agent Navigation	MIND2WEB Cross-Domain 1.0	Success Rate 445	26
Web Agent Navigation	MIND2WEB Cross-Task 1.0	Success Rate 5.28	26
GUI Navigation	Mind2Web (Cross-Website)	Element Accuracy 44.6	23
Web Action Generation Efficiency	Mind2Web Cross-Task	Time to Procedure 378.2	16
Web Action Generation Efficiency	Mind2Web (Cross-Website)	To_Pro Steps/Time 364.1	16
Web Action Generation Efficiency	Mind2Web Cross-Domain	To_Pro (Steps/Time) 334.9	16
Web Action Generation Efficiency	Mind2Web (All)	Time to Proposal Steps 363.6	16
Web Agent Navigation	Mind2Web All 1.0	Element Accuracy 0.484	16
Household simulation	ALFWorld (out-of-distribution)	Put Success Rate 75	12

