Agent-Dice: Disentangling Knowledge Updates via Geometric Consensus for Agent Continual Learning

About

Large Language Model (LLM)-based agents significantly extend the utility of LLMs by interacting with dynamic environments. However, enabling agents to continually learn new tasks without catastrophic forgetting remains a critical challenge, known as the stability-plasticity dilemma. In this work, we argue that this dilemma fundamentally arises from the failure to explicitly distinguish between common knowledge shared across tasks and conflicting knowledge introduced by task-specific interference. To address this, we propose Agent-Dice, a parameter fusion framework based on directional consensus evaluation. Concretely, Agent-Dice disentangles knowledge updates through a two-stage process: geometric consensus filtering to prune conflicting gradients, and curvature-based importance weighting to amplify shared semantics. We provide a rigorous theoretical analysis that establishes the validity of the proposed fusion scheme and offers insight into the origins of the stability-plasticity dilemma. Extensive experiments in the GUI-agent and tool-use agent domains demonstrate that Agent-Dice achieves strong continual learning performance with minimal computational overhead and few parameter updates. The code is available at https://github.com/Wuzheng02/Agent-Dice.
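
To make the two-stage idea concrete, here is a minimal, illustrative sketch in plain Python. It is not the authors' implementation, and the abstract does not specify the exact formulas. The sketch assumes that each task contributes one parameter-update vector, that "geometric consensus" can be approximated by a per-coordinate sign vote, and that "curvature" is supplied as a non-negative per-coordinate importance score (for example, a diagonal Fisher estimate). The function name fuse_updates and all inputs are hypothetical.

```python
from typing import List, Sequence

Vector = Sequence[float]


def sign(x: float) -> int:
    """Return -1, 0, or 1 depending on the sign of x."""
    return (x > 0) - (x < 0)


def fuse_updates(
    updates: Sequence[Vector],
    curvature: Sequence[Vector],
    eps: float = 1e-12,
) -> List[float]:
    """Fuse per-task updates into one vector (illustrative assumption only).

    Stage 1 (consensus filtering): per coordinate, take a sign vote across
    tasks and prune entries that disagree with the majority direction.
    Stage 2 (importance weighting): average the surviving entries, weighted
    by the caller-supplied curvature scores.
    """
    dim = len(updates[0])
    fused: List[float] = []
    for j in range(dim):
        majority = sign(sum(sign(u[j]) for u in updates))
        if majority == 0:
            # No directional consensus on this coordinate: leave it unchanged.
            fused.append(0.0)
            continue
        num = 0.0
        den = 0.0
        for u, c in zip(updates, curvature):
            if sign(u[j]) == majority:  # keep only entries that agree
                num += c[j] * u[j]
                den += c[j]
        fused.append(num / den if den > eps else 0.0)
    return fused


# Toy example: three tasks, three parameters (made-up numbers).
updates = [[1.0, -2.0, 0.5], [3.0, 2.0, -0.5], [2.0, -4.0, 0.0]]
curvature = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [2.0, 3.0, 1.0]]
print(fuse_updates(updates, curvature))  # [2.0, -3.5, 0.0]
```

In the toy example, all tasks agree on the first coordinate, so it becomes a weighted average. The second task's conflicting entry is pruned on the second coordinate, and the third coordinate has no majority direction, so it is zeroed out. The method actually used in the paper may differ in both the consensus criterion and the weighting; see the linked repository for the real implementation.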

Zheng Wu, Xingyu Lou, Xinbei Ma, Yansi Li, Weiwen Liu, Weinan Zhang, Jun Wang, Zhuosheng Zhang • 2026

Related benchmarks

Task | Dataset | Result | Rank
GUI Agent | AITZ | SR 57.1 | 20
Tool Use | Tool-use domain Aggregate | AvgZ Score 0.79 | 18
Tool Use | Tool-use domain Subset 3 | Functionality Score 100 | 18
Tool Use | Tool-use domain Subset 2 | Func Success Rate 99.26 | 18
Tool Use | Tool-use domain Subset 0 | Func Success Rate 99.29 | 18
Tool Use | Tool-use domain Subset 1 | Func 99.63 | 18
GUI Agent Navigation and Action | AITZ | Type Accuracy 68.28 | 7
GUI Agent | AndroidControl | Type 80.03 | 7
GUI Agent | GUI-Odyssey | Type Accuracy 89.27 | 7
GUI Agent Navigation and Action | AndroidControl | Type Rate 79.39 | 7
Showing 10 of 12 rows