Dynamic System Instructions and Tool Exposure for Efficient Agentic LLMs

About

Large Language Model (LLM) agents often run for many steps while re-ingesting long system instructions and large tool catalogs each turn. This increases cost, latency, the probability of agent derailment, and tool-selection errors. We propose Instruction-Tool Retrieval (ITR), a RAG variant that retrieves, per step, only the minimal system-prompt fragments and the smallest necessary subset of tools. ITR composes a dynamic runtime system prompt and exposes a narrowed toolset with confidence-gated fallbacks. Using a controlled benchmark with internally consistent numbers, ITR reduces per-step context tokens by 95%, improves correct tool routing by 32% (relative), and cuts end-to-end episode cost by 70% versus a monolithic baseline. These savings enable agents to run 2 to 20 times more loops within context limits. Savings compound with the number of agent steps, making ITR particularly valuable for long-running autonomous agents. We detail the method, evaluation protocol, ablations, and operational guidance for practical deployment.
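The per-step retrieval described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the `Item` type, the Jaccard word-overlap scorer, the `select_for_step` function, the example catalogs, and the `min_conf` threshold are all hypothetical stand-ins chosen for illustration. A real system would presumably use an embedding-based or hybrid retriever and a calibrated confidence measure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """A retrievable unit: either a system-prompt fragment or a tool description."""
    name: str
    text: str


def _tokens(s: str) -> set[str]:
    return set(s.lower().split())


def score(query: str, item: Item) -> float:
    """Jaccard word overlap; a toy stand-in for a real retriever (assumption)."""
    q, t = _tokens(query), _tokens(item.text)
    union = q | t
    return len(q & t) / len(union) if union else 0.0


def top_k(query: str, items: list[Item], k: int) -> list[tuple[float, Item]]:
    """Return the k highest-scoring (score, item) pairs, best first."""
    scored = [(score(query, it), it) for it in items]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


def select_for_step(query, fragments, tools, k_frag=2, k_tool=2, min_conf=0.1):
    """Compose a per-step system prompt and a narrowed list of tool names.

    If the best tool score falls below min_conf (a hypothetical threshold),
    fall back to exposing the full tool catalog instead of a narrowed subset.
    """
    frag_hits = top_k(query, fragments, k_frag)
    tool_hits = top_k(query, tools, k_tool)

    # Keep only fragments that matched the query at all.
    system_prompt = "\n".join(it.text for s, it in frag_hits if s > 0)

    best = tool_hits[0][0] if tool_hits else 0.0
    if best < min_conf:
        exposed = list(tools)  # confidence-gated fallback: expose everything
    else:
        exposed = [it for s, it in tool_hits if s > 0]
    return system_prompt, [t.name for t in exposed]


# Hypothetical catalogs used only to exercise the sketch.
FRAGMENTS = [
    Item("refunds", "refund policy: verify the order before issuing a refund"),
    Item("tone", "always answer in a polite concise tone"),
    Item("shipping", "shipping questions: look up the tracking number first"),
]
TOOLS = [
    Item("issue_refund", "issue a refund for an order"),
    Item("track_shipment", "look up shipment tracking status"),
    Item("search_docs", "search internal documentation"),
]

if __name__ == "__main__":
    prompt, names = select_for_step(
        "customer wants a refund for order 1234", FRAGMENTS, TOOLS
    )
    print(names)
    print(prompt)
```

In this sketch, a query that matches a tool description well yields a one-tool exposure and a short prompt, while a query that matches nothing triggers the fallback and exposes the whole catalog. The fallback trades token savings for safety on steps where the retriever is unsure.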

Uria Franko • 2025

Related benchmarks

Task	Dataset	Metric	Result	Rank
Multi-hop Reasoning	Multi-hop reasoning tasks T2 L ≈ 9 steps	API Success Rate	79	4
Tool selection	All Tasks	Tools Correct	82	4
