
ZERA: Zero-init Instruction Evolving Refinement Agent -- From Zero Instructions to Structured Prompts via Principle-based Optimization

About

Automatic Prompt Optimization (APO) improves large language model (LLM) performance by refining prompts for specific tasks. However, prior APO methods typically focus only on user prompts, rely on unstructured feedback, and require large sample sizes and long iteration cycles, making them costly and brittle. We propose ZERA (Zero-init Instruction Evolving Refinement Agent), a novel framework that jointly optimizes both system and user prompts through principled, low-overhead refinement. ZERA scores prompts using eight generalizable criteria with automatically inferred weights, and revises prompts based on these structured critiques. This enables fast convergence to high-quality prompts using minimal examples and short iteration cycles. We evaluate ZERA across five LLMs and nine diverse datasets spanning reasoning, summarization, and code generation tasks. Experimental results demonstrate consistent improvements over strong baselines. Further ablation studies highlight the contribution of each component to more effective prompt construction. Our implementation, including all prompts, is publicly available at https://github.com/younatics/zera-agent.
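The refinement loop described above can be sketched in a few lines. This is a hypothetical illustration only: the criterion names, the `judge`, and the `reviser` below are toy placeholders, whereas in ZERA both roles are filled by an LLM that produces structured per-criterion critiques and revised system/user prompt pairs.

```python
# Sketch of a ZERA-style refinement loop (illustrative, not the actual
# implementation). Criterion names are stand-ins for the paper's eight
# generalizable criteria; scoring and revision are toy functions here.

CRITERIA = ["clarity", "specificity", "coverage", "constraints",
            "format", "reasoning", "robustness", "brevity"]

def weighted_score(scores, weights):
    """Aggregate per-criterion scores using (automatically inferred) weights."""
    total = sum(weights.values())
    return sum(scores[c] * weights[c] for c in CRITERIA) / total

def refine(system_prompt, user_prompt, judge, reviser, weights, steps=3):
    """Jointly refine system and user prompts, keeping the best-scoring pair."""
    best = (system_prompt, user_prompt)
    best_score = weighted_score(judge(*best), weights)
    for _ in range(steps):
        critique = judge(*best)               # structured per-criterion scores
        candidate = reviser(*best, critique)  # revise both prompts from critique
        score = weighted_score(judge(*candidate), weights)
        if score > best_score:                # greedy: keep only improvements
            best, best_score = candidate, score
    return best, best_score

# Toy stand-ins so the loop runs end to end:
def toy_judge(sys_p, usr_p):
    # Toy heuristic: longer prompts score higher on every criterion.
    v = min(1.0, (len(sys_p) + len(usr_p)) / 200)
    return {c: v for c in CRITERIA}

def toy_reviser(sys_p, usr_p, critique):
    # Toy revision: append an instruction to the system prompt.
    return sys_p + " Be precise.", usr_p

weights = {c: 1.0 for c in CRITERIA}
# Zero-init: start from an empty system prompt, as in ZERA's setting.
best_pair, best_s = refine("", "Solve the task.", toy_judge, toy_reviser, weights)
```

The greedy keep-only-improvements step is one simple way to get fast convergence from few examples; the actual agent's acceptance logic may differ.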

Seungyoun Yi, Minsoo Khang, Sungrae Park • 2025

Related benchmarks

Task                   | Dataset          | Result                          | Rank
Math Reasoning         | GSM8K (test)     | Accuracy: 89.98                 | 192
Mathematical Reasoning | GSM-Hard         | --                              | 162
Mathematical Reasoning | GSM8K (val)      | --                              | 81
Mathematical Reasoning | AQuA-RAT (test)  | Accuracy: 81.36                 | 40
Math Reasoning         | MultiArith (test)| Accuracy: 99.59                 | 30
Math Reasoning         | GSM-Hard (test)  | Accuracy: 53.22                 | 30
Mathematical Reasoning | AQUA (val)       | Tokens at Best Step (K): 529    | 7
Mathematical Reasoning | MultiArith (val) | Tokens at Best Step (K): 460    | 7
