ACON: Optimizing Context Compression for Long-horizon LLM Agents

About

Large language models (LLMs) are increasingly deployed as agents in dynamic real-world environments, where success depends on maintaining precise records of actions and observations. However, the resulting unbounded context growth in long-horizon agentic tasks creates two critical bottlenecks: prohibitive inference memory costs and reasoning degradation due to irrelevant information. Existing compression methods fail to fully address this, often relying on brittle heuristics or requiring parameter updates impractical for proprietary or large-scale LLMs. We introduce Agent Context Optimization (ACON), a unified framework that optimally compresses both observations and history into concise, informative representations. Distinct from prior works, ACON employs an optimization in natural language space: it iteratively refines compression guidelines based on failure analysis of the agent, ensuring critical state information is preserved without model fine-tuning. To further minimize computational overhead, we distill the optimized compressor into smaller models. Experiments on AppWorld, OfficeBench, and Multi-objective QA demonstrate that ACON reduces peak token usage by 26-54% while improving task success over existing compression baselines. Notably, it enables smaller LMs to function effectively as long-horizon agents, achieving up to 46% performance improvement by mitigating context distraction. Our code is available at https://github.com/microsoft/acon.
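
The guideline-refinement loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function `refine_guidelines` and the callbacks `run_agent`, `analyze_failure`, and `update_guideline` are hypothetical names introduced here, and the rule of collecting feedback only from tasks that succeed with full context but fail with compressed context is an assumption about how "failure analysis" might be set up. In practice each callback would presumably wrap an LLM call.

```python
from typing import Callable, List, Optional, Tuple

Trajectory = List[str]  # hypothetical: a list of logged agent steps


def refine_guidelines(
    guideline: str,
    tasks: List[str],
    run_agent: Callable[[str, Optional[str]], Tuple[bool, Trajectory]],
    analyze_failure: Callable[[Trajectory, Trajectory], str],
    update_guideline: Callable[[str, List[str]], str],
    rounds: int = 3,
) -> str:
    """Iteratively refine a natural-language compression guideline.

    All callbacks are placeholders (assumptions, not part of ACON's API):
      run_agent(task, guideline) -> (success, trajectory); guideline=None
          means the agent runs with full, uncompressed context.
      analyze_failure(full_traj, compressed_traj) -> textual feedback.
      update_guideline(guideline, feedback) -> revised guideline.
    """
    for _ in range(rounds):
        feedback: List[str] = []
        for task in tasks:
            ok_full, full_traj = run_agent(task, None)
            ok_comp, comp_traj = run_agent(task, guideline)
            # Assumed criterion: compression is blamed only when the
            # uncompressed run succeeds and the compressed run fails.
            if ok_full and not ok_comp:
                feedback.append(analyze_failure(full_traj, comp_traj))
        if not feedback:
            break  # no compression-induced failures left on these tasks
        guideline = update_guideline(guideline, feedback)
    return guideline
```

No model parameters are touched in this loop; only the guideline text changes between rounds, which is the sense in which the optimization happens in natural language space.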

Minki Kang, Wei-Ning Chen, Dongge Han, Huseyin A. Inan, Lukas Wutschitz, Yanzhi Chen, Robert Sim, Saravan Rajmohan • 2025

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Mean Reward | Webshop | Mean Reward: 53.3 | 30 |
| Mean Reward | AlfWorld | Mean Reward: 0.4 | 30 |
| Mean Reward | ScienceWorld | Mean Reward: 0.172 | 30 |
| Agentic Task Completion | AppWorld (test-normal) | Accuracy: 56.5 | 22 |
| Interactive Agent Task | Webshop | Efficiency Multiplier: 14 | 15 |
| Interactive Agent Task | ScienceWorld | Efficiency Factor: 3.4 | 15 |
| Interactive Agent Task | AlfWorld | Effective Steps Multiplier: 3.3 | 15 |
| Multi-step Reasoning | TriviaQA | Task Performance: 57.14 | 14 |
| Web-based tool-use | Mind2Web | Task Performance: 30.77 | 12 |
| Agentic Task Completion | AppWorld Easy normal (test) | Accuracy: 86 | 11 |
Showing 10 of 15 rows
