Reflecting with Two Voices: A Co-Adaptive Dual-Strategy Framework for LLM-Based Agent Decision Making

About

Large language model (LLM) agents often rely on external demonstrations or retrieval-augmented planning, leading to brittleness, poor generalization, and high computational overhead. Inspired by human problem-solving, we propose DuSAR (Dual-Strategy Agent with Reflecting) -- a demonstration-free framework that enables a single frozen LLM to perform co-adaptive reasoning via two complementary strategies: a high-level holistic plan and a context-grounded local policy. These strategies interact through a lightweight reflection mechanism, where the agent continuously assesses progress via a Strategy Fitness Score and dynamically revises its global plan when stuck or refines it upon meaningful advancement, mimicking human metacognitive behavior. On both simulated household (ALFWorld) and real-world web (Mind2Web) environments, DuSAR achieves state-of-the-art performance using only open-source LLMs, substantially outperforming all prior methods without any demonstrations or fine-tuning. Remarkably, it also reduces per-step token consumption by a large margin while maintaining strong task success. Ablation studies confirm the necessity of dual-strategy coordination. Moreover, optional integration of expert demonstrations further boosts performance, highlighting DuSAR's flexibility and compatibility with external knowledge.
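
The reflection mechanism described above can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the Strategy Fitness Score is computed or what thresholds trigger a revision, so the score is taken as a given number in [0, 1], and the names `AgentState`, `reflect`, `stuck_below`, and `advance_at` are hypothetical. The `revise` and `refine` callables stand in for the LLM calls that would rewrite the holistic plan.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class AgentState:
    """Hypothetical container for the high-level plan and past fitness scores."""
    global_plan: str
    fitness_history: List[float] = field(default_factory=list)


def reflect(
    state: AgentState,
    fitness: float,
    revise: Callable[[str], str],
    refine: Callable[[str], str],
    stuck_below: float = 0.3,   # assumed threshold, not taken from the paper
    advance_at: float = 0.7,    # assumed threshold, not taken from the paper
) -> str:
    """Update the global plan from one fitness reading and report the action taken."""
    state.fitness_history.append(fitness)
    if fitness < stuck_below:
        # Low score: treat the agent as stuck and revise the global plan.
        state.global_plan = revise(state.global_plan)
        return "revise"
    if fitness >= advance_at:
        # High score: meaningful progress, so refine the existing plan.
        state.global_plan = refine(state.global_plan)
        return "refine"
    # Otherwise keep the plan and let the local policy continue.
    return "keep"


if __name__ == "__main__":
    state = AgentState(global_plan="find the mug, then put it on the desk")
    action = reflect(
        state,
        fitness=0.1,
        revise=lambda plan: plan + " [revised]",
        refine=lambda plan: plan + " [refined]",
    )
    print(action, "->", state.global_plan)
```

In a real agent, the fitness value and both plan-rewriting functions would come from prompts to the same frozen LLM; here they are plain Python stand-ins so the control flow can be run on its own.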

Wentao Zhang, Qunbo Wang, BoXuan Zhao, Tao Zhang, Junsheng Wu, Hongping Gan, Ling Dai, Shizhuang Deng, Shuntong Sun, Yang Liu • 2025

Related benchmarks

Task | Dataset | Metric | Result | Rank
GUI Navigation | Mind2Web (Cross-Website) | Element Accuracy | 44.6 | 23
Web Action Generation Efficiency | Mind2Web Cross-Task | Time to Procedure | 378.2 | 16
Web Action Generation Efficiency | Mind2Web (Cross-Website) | To_Pro Steps/Time | 364.1 | 16
Web Action Generation Efficiency | Mind2Web Cross-Domain | To_Pro (Steps/Time) | 334.9 | 16
Web Action Generation Efficiency | Mind2Web (All) | Time to Proposal Steps | 363.6 | 16
Web Agent Navigation | MIND2WEB Cross-Task 1.0 | Element Accuracy | 54.9 | 16
Web Agent Navigation | MIND2WEB Cross-Domain 1.0 | Element Accuracy | 47.4 | 16
Web Agent Navigation | Mind2Web All 1.0 | Element Accuracy | 0.484 | 16
Household simulation | ALFWorld (out-of-distribution) | Put Success Rate | 75 | 12