ACON: Optimizing Context Compression for Long-horizon LLM Agents

About

Large language models (LLMs) are increasingly deployed as agents in dynamic real-world environments, where success depends on maintaining precise records of actions and observations. However, the resulting unbounded context growth in long-horizon agentic tasks creates two critical bottlenecks: prohibitive inference memory costs and reasoning degradation caused by irrelevant information. Existing compression methods fail to fully address this, often relying on brittle heuristics or requiring parameter updates that are impractical for proprietary or large-scale LLMs. We introduce Agent Context Optimization (ACON), a unified framework that optimally compresses both observations and history into concise, informative representations. Unlike prior work, ACON optimizes in natural-language space: it iteratively refines compression guidelines based on failure analysis of the agent, ensuring that critical state information is preserved without model fine-tuning. To further minimize computational overhead, we distill the optimized compressor into smaller models. Experiments on AppWorld, OfficeBench, and Multi-objective QA demonstrate that ACON reduces peak token usage by 26-54% while improving task success over existing compression baselines. Notably, it enables smaller LMs to function effectively as long-horizon agents, achieving up to 46% performance improvement by mitigating context distraction. Our code is available at https://github.com/microsoft/acon.
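
The guideline-refinement idea described in the abstract can be illustrated with a minimal Python sketch. This is not the ACON implementation: the function names (refine_guideline, run_agent, analyze_failures), the acceptance rule (keep a revised guideline only if the success rate improves), and the toy agent and analyzer are all assumptions made for illustration. In a real system, run_agent would run an LLM agent whose context is compressed under the guideline, and analyze_failures would ask an LLM to revise the guideline after inspecting failed trajectories.

```python
from typing import Callable, List, Tuple


def refine_guideline(
    guideline: str,
    tasks: List[str],
    run_agent: Callable[[str, str], bool],
    analyze_failures: Callable[[str, List[str]], str],
    max_rounds: int = 3,
) -> Tuple[str, float]:
    """Iteratively rewrite a natural-language compression guideline.

    run_agent(task, guideline) returns True if the agent solves the task
    when its context is compressed under the guideline (hypothetical hook).
    analyze_failures(guideline, failed_tasks) returns a revised guideline
    (hypothetical hook). No model parameters are updated anywhere.
    """
    if not tasks:
        raise ValueError("tasks must be non-empty")

    def success_rate(g: str) -> float:
        return sum(run_agent(t, g) for t in tasks) / len(tasks)

    best, best_score = guideline, success_rate(guideline)
    for _ in range(max_rounds):
        # Collect the tasks that still fail under the current best guideline.
        failed = [t for t in tasks if not run_agent(t, best)]
        if not failed:
            break
        # Propose a revision from the failures; keep it only if it helps.
        candidate = analyze_failures(best, failed)
        score = success_rate(candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score


# Toy stand-ins (assumptions): a task "succeeds" only if the guideline
# explicitly mentions the piece of state that the task depends on.
def toy_agent(task: str, guideline: str) -> bool:
    return task in guideline


def toy_analyzer(guideline: str, failed: List[str]) -> str:
    return guideline + " Keep " + failed[0] + "."


best, score = refine_guideline(
    "Summarize briefly.", ["ids", "dates"], toy_agent, toy_analyzer
)
print(best, score)
```

In the toy run, each round adds one instruction to preserve a piece of state that a failed task needed, which mirrors the abstract's point that the optimization happens in the guideline text rather than in model weights.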

Minki Kang, Wei-Ning Chen, Dongge Han, Huseyin A. Inan, Lukas Wutschitz, Yanzhi Chen, Robert Sim, Saravan Rajmohan • 2025

Related benchmarks

Task	Dataset	Metric	Result
Web Navigation and Shopping	Webshop	Score	62.7	248
Mean Reward	Webshop	Mean Reward	53.3	30
Mean Reward	AlfWorld	Mean Reward	0.4	30
Mean Reward	ScienceWorld	Mean Reward	0.172	30
Agentic Task Completion	AppWorld (test-normal)	Accuracy	56.5	22
Interactive Agent Task	Webshop	Efficiency Multiplier	14	15
Interactive Agent Task	ScienceWorld	Efficiency Factor	3.4	15
Interactive Agent Task	AlfWorld	Effective Steps Multiplier	3.3	15
Multi-step Reasoning	TriviaQA	Task Performance	57.14	14
Web-based tool-use	Mind2Web	Task Performance	30.77	12

Showing 10 of 28 rows
