
Improving Coherence and Persistence in Agentic AI for System Optimization

About

Designing high-performance system heuristics is a creative, iterative process requiring experts to form hypotheses and execute multi-step conceptual shifts. While Large Language Models (LLMs) show promise in automating this loop, they struggle with complex system problems due to two critical failure modes: evolutionary neighborhood bias and the coherence ceiling. Evolutionary methods often remain trapped in local optima because they rely on scalar benchmark scores, and they fail when coordinated multi-step changes are required. Conversely, existing agentic frameworks suffer from context degradation over long horizons or fail to accumulate knowledge across independent runs. We present Engram, an agentic researcher architecture that addresses these limitations by decoupling long-horizon exploration from the constraints of a single context window. Engram organizes exploration into a sequence of agents that iteratively design, test, and analyze mechanisms. At the conclusion of each run, an agent stores code snapshots, logs, and results in a persistent Archive and distills high-level modeling insights into a compact, persistent Research Digest. Subsequent agents then begin with a fresh context window, reading the Research Digest to build on prior discoveries. We find that Engram exhibits superior performance across diverse domains, including multi-cloud multicast, LLM inference request routing, and KV cache reuse optimization in databases with natural language queries.
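
The run-to-run handoff described in the abstract can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not Engram's implementation: the names `RunRecord`, `ResearchState`, `run_sequence`, and `toy_agent` are hypothetical, the agent is a stand-in function where the real system would use an LLM agent, and trimming the digest to its most recent entries is an assumed placeholder for whatever distillation the authors actually use.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class RunRecord:
    """Artifacts one agent run leaves in the Archive (hypothetical schema)."""
    code_snapshot: str
    logs: str
    score: float


@dataclass
class ResearchState:
    """Persistent state shared across runs (hypothetical structure)."""
    archive: List[RunRecord] = field(default_factory=list)  # full artifacts
    digest: List[str] = field(default_factory=list)         # compact insights


# An agent run maps the current digest to (archive record, new insights).
AgentRun = Callable[[List[str]], Tuple[RunRecord, List[str]]]


def run_sequence(agent: AgentRun, n_runs: int,
                 max_digest_items: int = 20) -> ResearchState:
    """Run agents one after another, each starting from a fresh context."""
    state = ResearchState()
    for _ in range(n_runs):
        # Fresh context: the agent receives only a copy of the digest,
        # never the previous run's full transcript.
        record, insights = agent(list(state.digest))
        state.archive.append(record)    # keep everything in the Archive
        state.digest.extend(insights)   # add distilled insights
        # Assumption: keep the digest compact by retaining recent entries.
        state.digest = state.digest[-max_digest_items:]
    return state


def toy_agent(digest: List[str]) -> Tuple[RunRecord, List[str]]:
    """Stand-in for an LLM agent; the score is made up for illustration."""
    step = len(digest)
    record = RunRecord(
        code_snapshot=f"heuristic_v{step}",
        logs=f"read {len(digest)} insights",
        score=float(step + 1),
    )
    return record, [f"insight from run {step}"]


state = run_sequence(toy_agent, n_runs=3)
```

In this sketch the Archive grows without bound while the digest stays capped, which mirrors the abstract's distinction between stored artifacts and the compact summary that later agents read.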

Pantea Karimi, Kimia Noorbakhsh, Mohammad Alizadeh, Hari Balakrishnan • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Transaction scheduling | ADRS TXN | Best | 3.92e+3 | 12 |
| Congestion Based Loss Optimization | ADRS CBL | CBL Average Best Score | 103.6 | 3 |
| Estimated Packet Loss Budget Optimization | ADRS EPLB | Average Best EPLB Score | 27.3 | 3 |
| Network Policy Monitoring (Prism) | ADRS Prism | Prism Average Best Score | 27.94 | 3 |
| Telemetry Data Optimization | ADRS Telemetry | Telemetry (Average Best Score) | 95.4 | 3 |
| Multi-cloud Congestion Based Loss Optimization | ADRS CBL-Multi | Average Best Score (CBL-Multi) | 79.9 | 3 |
| System Optimization | Telemetry | Median Normalized AUC | 96.5 | 2 |
| System Optimization | EPLB | Median Normalized AUC | 87 | 2 |
| System Optimization | PRISM | Median Normalized AUC | 90.3 | 2 |
| System Optimization | TXN | Median Normalized AUC | 83 | 2 |
Showing 10 of 14 rows
