A Hierarchical Error-Corrective Graph Framework for Autonomous Agents with LLM-Based Action Generation
About
We propose a Hierarchical Error-Corrective Graph Framework for Autonomous Agents with LLM-Based Action Generation (HECG), which incorporates three core innovations: (1) Multi-Dimensional Transferable Strategy (MDTS): by integrating task quality metrics (Q), confidence/cost metrics (C), reward metrics (R), and LLM-based semantic reasoning scores (LLM-Score), MDTS achieves multi-dimensional alignment between quantitative performance and semantic context, enabling more precise selection of high-quality candidate strategies and effectively reducing the risk of negative transfer. (2) Error Matrix Classification (EMC): unlike simple confusion matrices or overall performance metrics, EMC provides structured attribution of task failures by categorizing errors into ten types, such as Strategy Errors (Strategy Whe) and Script Parsing Errors (Script-Parsing-Error), and decomposing them according to severity, typical actions, error descriptions, and recoverability. This allows precise analysis of the root causes of task failures, offering clear guidance for subsequent error correction and strategy optimization rather than relying solely on overall success rates or single performance metrics. (3) Causal-Context Graph Retrieval (CCGR): to enhance agent retrieval capabilities in dynamic task environments, we construct graphs from historical states, actions, and event sequences, where nodes store executed actions, next-step actions, execution states, transferable strategies, and other relevant information, and edges represent causal dependencies such as preconditions for transitions between nodes. CCGR identifies subgraphs most relevant to the current task context, effectively capturing structural relationships beyond vector similarity, allowing agents to fully leverage contextual information, accelerate strategy adaptation, and improve execution reliability in complex, multi-step tasks.
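To make the MDTS idea concrete, the following is a minimal sketch of how the four signals named above (Q, C, R, LLM-Score) might be combined to rank candidate strategies. The abstract does not specify the combination rule, so everything here is an assumption for illustration: the equal weights, the normalization of each signal to [0, 1], the treatment of C as a "higher is better" confidence value, the `min_llm_score` gate used to discard semantically mismatched candidates, and the names `Candidate`, `mdts_score`, and `select_strategies`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A candidate strategy with the four MDTS signals (assumed in [0, 1])."""
    name: str
    quality: float      # Q: task quality metric
    confidence: float   # C: confidence/cost metric, here higher = better (assumption)
    reward: float       # R: reward metric
    llm_score: float    # LLM-based semantic reasoning score


# Hypothetical equal weights; the paper's actual weighting is not given here.
WEIGHTS = {"quality": 0.25, "confidence": 0.25, "reward": 0.25, "llm_score": 0.25}


def mdts_score(candidate: Candidate, weights: dict = WEIGHTS) -> float:
    """Weighted sum of the four signals (an assumed combination rule)."""
    return sum(w * getattr(candidate, field) for field, w in weights.items())


def select_strategies(candidates, top_k: int = 1, min_llm_score: float = 0.5):
    """Drop candidates whose semantic score is too low, then rank the rest.

    The semantic gate is one possible way to reduce negative transfer:
    a strategy with strong numbers but a poor semantic match is never selected.
    """
    eligible = [c for c in candidates if c.llm_score >= min_llm_score]
    return sorted(eligible, key=mdts_score, reverse=True)[:top_k]


if __name__ == "__main__":
    pool = [
        Candidate("reuse_fridge_plan", 0.9, 0.9, 0.9, 0.2),   # strong metrics, poor semantic fit
        Candidate("reuse_dishwasher_plan", 0.6, 0.6, 0.6, 0.8),
        Candidate("generic_pick_and_place", 0.5, 0.5, 0.5, 0.6),
    ]
    for chosen in select_strategies(pool, top_k=2):
        print(chosen.name, round(mdts_score(chosen), 3))
```

In this sketch the first candidate is filtered out despite having the highest quantitative metrics, which illustrates why a semantic signal alongside Q, C, and R can matter when transferring strategies across tasks.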
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Putdishwasher | Household Tasks bedroom and kitchen | -- | 6 |
| Preparefood | Household Tasks kitchen_and_livingroom | -- | 3 |
| Preparefood | Household Tasks bedroom_and_bathroom | -- | 3 |
| Preparefood | VirtualHome kitchen and livingroom | -- | 3 |
| Preparefood | VirtualHome bedroom_and_bathroom | -- | 3 |
| Putdishwasher | Household Tasks livingroom_and_bedroom | -- | 3 |
| Putdishwasher | VirtualHome livingroom_and_bedroom | -- | 3 |
| Putfridge | Household Tasks bathroom and livingroom | -- | 3 |
| Putfridge | Household Tasks kitchen_and_bathroom | -- | 3 |
| Putfridge | VirtualHome bathroom_and_livingroom | -- | 3 |