UPA: Unsupervised Prompt Agent via Tree-Based Search and Selection

About

Prompt agents have recently emerged as a promising paradigm for automated prompt optimization, framing prompt discovery as a sequential decision-making problem over a structured prompt space. While this formulation enables the use of advanced planning algorithms, these methods typically assume access to supervised reward signals, which are often unavailable in practical scenarios. In this work, we propose UPA, an Unsupervised Prompt Agent that realizes structured search and selection without relying on ground-truth (GT) rewards. Specifically, during search, UPA iteratively constructs an evolving tree structure to navigate the prompt space, guided by fine-grained and position-debiased pairwise comparisons from Large Language Models (LLMs). Crucially, as these local comparisons do not inherently yield a consistent global scale, we decouple systematic prompt exploration from final selection, introducing a two-stage framework grounded in the Bradley-Terry-Luce (BTL) model. This framework first performs path-wise Bayesian aggregation of local comparisons to filter candidates under uncertainty, followed by global tournament-style comparisons to infer latent prompt quality and identify the optimal prompt. Experiments across multiple tasks demonstrate that UPA consistently outperforms existing prompt optimization methods, showing that agent-style optimization can remain highly effective even in unsupervised settings.
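To make the two ingredients named in the abstract more concrete, here is a minimal sketch of (1) a position-debiased pairwise comparison and (2) fitting Bradley-Terry-Luce strengths from the resulting win counts. It is an illustration under stated assumptions, not the authors' implementation: the `judge` callable, the order-swapping rule for debiasing, the `smoothing` pseudo-count, and the use of the standard minorization-maximization (MM) update in place of the paper's Bayesian aggregation are all assumptions made for this example.

```python
from itertools import combinations


def debiased_compare(judge, a, b):
    """Score prompt `a` against prompt `b` with both presentation orders.

    `judge(x, y)` is a hypothetical callable (e.g. an LLM wrapper) that
    returns "first" if it prefers x and "second" if it prefers y.
    Returns 1.0 if `a` wins in both orders, 0.0 if `b` wins in both,
    and 0.5 when the verdict flips with the order (treated as a tie).
    """
    a_wins_when_first = judge(a, b) == "first"
    a_wins_when_second = judge(b, a) == "second"
    if a_wins_when_first and a_wins_when_second:
        return 1.0
    if not a_wins_when_first and not a_wins_when_second:
        return 0.0
    return 0.5


def fit_btl(wins, n_items, iters=200, smoothing=0.1):
    """Estimate BTL strengths p with P(i beats j) = p[i] / (p[i] + p[j]).

    `wins[(i, j)]` is the (possibly fractional) number of times item i
    beat item j. `smoothing` adds a small symmetric pseudo-count to every
    pair so that the estimate stays finite for unbeaten items (assumption).
    Requires n_items >= 2. Uses the classic MM fixed-point update.
    """
    w = [[0.0] * n_items for _ in range(n_items)]
    for i, j in combinations(range(n_items), 2):
        w[i][j] = wins.get((i, j), 0.0) + smoothing
        w[j][i] = wins.get((j, i), 0.0) + smoothing

    p = [1.0 / n_items] * n_items
    for _ in range(iters):
        updated = []
        for i in range(n_items):
            total_wins = sum(w[i][j] for j in range(n_items) if j != i)
            denom = sum(
                (w[i][j] + w[j][i]) / (p[i] + p[j])
                for j in range(n_items)
                if j != i
            )
            updated.append(total_wins / denom)
        norm = sum(updated)
        p = [x / norm for x in updated]  # normalize so strengths sum to 1
    return p
```

In this sketch, the scores returned by `debiased_compare` would be accumulated into the `wins` dictionary for each compared pair of candidate prompts, and the candidate with the largest fitted strength would be selected.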

Siran Peng, Weisong Zhao, Tianyu Fu, Chenxu Zhao, Tianshuo Zhang, Haoyuan Zhang, Xiangyu Zhu, Minghui Wu, Zhen Lei • 2026

Related benchmarks

Task                    | Dataset              | Metric   | Result | Rank
------------------------+----------------------+----------+--------+-----
Question Answering      | GPQA                 | Accuracy | 84.2   | 258
Logical reasoning       | BBH                  | Accuracy | 100    | 249
Coreference Resolution  | WSC                  | Accuracy | 98.5   | 116
Mathematical Reasoning  | AGIEval MATH         | Accuracy | 95.7   | 99
Question Answering      | GPQA (test)          | Accuracy | 45.5   | 65
Mathematical Reasoning  | AGIEval-MATH (test)  | Accuracy | 52.1   | 31
Coreference Resolution  | WSC (test)           | Accuracy | 82.7   | 19
Fact Checking           | LIAR                 | Accuracy | 78.8   | 12
Fact Checking           | LIAR (test)          | Accuracy | 68.2   | 11
Navigation Reasoning    | BBH-Navigate (test)  | Accuracy | 98     | 11
Showing 10 of 11 rows
