Toward Formalizing LLM-Based Agent Designs through Structural Context Modeling and Semantic Dynamics Analysis

About

Current research on large language model (LLM) agents is fragmented: discussions of conceptual frameworks and methodological principles are frequently intertwined with low-level implementation details, so both readers and authors lose track amid a proliferation of superficially distinct concepts. We argue that this fragmentation largely stems from the absence of an analyzable, self-consistent formal model that enables implementation-independent characterization and comparison of LLM agents. To address this gap, we propose the Structural Context Model, a formal model for analyzing and comparing LLM agents from the perspective of context structure. Building on this foundation, we introduce two complementary components that together span the full lifecycle of LLM agent research and development: (1) a declarative implementation framework; and (2) a sustainable agent engineering workflow, Semantic Dynamics Analysis. The proposed workflow provides principled insights into agent mechanisms and supports rapid, systematic design iteration. We demonstrate the effectiveness of the complete framework on dynamic variants of the monkey-banana problem, where agents engineered using our approach achieve an improvement in success rate of up to 32 percentage points on the most challenging setting.

Haoyu Jia, Kento Kawaharazuka, Kei Okada • 2026

Related benchmarks

Task | Dataset | Result | Rank
Monkey-banana task execution | Dual Bananas (Scene 4) | Success Rate: 100 | 6
Monkey-banana task execution | Dual Bananas (Scene 5) | Success Rate: 95 | 6
Robotic Planning | Comprehensive (Scene 13) | Success Rate: 100 | 6
Robotic Planning | Comprehensive (Scene 14) | Success Rate: 89 | 6
Robotic Planning | Comprehensive (Scene 15) | Success Rate: 0.32 | 6
Robotic Task Planning | Classic (Scene 1) | Success Rate: 100 | 6
Robotic Task Planning | Classic (Scene 2) | Success Rate: 97 | 6
Robotic Task Planning | Classic (Scene 3) | Success Rate: 0.66 | 6
Task Planning | Shortsighted Monkey (Scene 8) | Success Rate: 100 | 6
Task Planning | Shortsighted Monkey (Scene 9) | Success Rate: 0.56 | 6
Showing 10 of 12 rows
