
Promptbreeder: Self-Referential Self-Improvement Via Prompt Evolution

About

Popular prompt strategies like Chain-of-Thought Prompting can dramatically improve the reasoning abilities of Large Language Models (LLMs) in various domains. However, such hand-crafted prompt strategies are often sub-optimal. In this paper, we present Promptbreeder, a general-purpose self-referential self-improvement mechanism that evolves and adapts prompts for a given domain. Driven by an LLM, Promptbreeder mutates a population of task-prompts, and subsequently evaluates them for fitness on a training set. Crucially, the mutation of these task-prompts is governed by mutation-prompts that the LLM generates and improves throughout evolution in a self-referential way. That is, Promptbreeder is not just improving task-prompts, but it is also improving the mutation-prompts that improve these task-prompts. Promptbreeder outperforms state-of-the-art prompt strategies such as Chain-of-Thought and Plan-and-Solve Prompting on commonly used arithmetic and commonsense reasoning benchmarks. Furthermore, Promptbreeder is able to evolve intricate task-prompts for the challenging problem of hate speech classification.
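The loop the abstract describes can be sketched in a few lines: a population of units, each pairing a task-prompt with the mutation-prompt that rewrites it, evolved by tournament selection, with the mutation-prompt itself occasionally rewritten by the LLM (the self-referential step). This is a minimal illustrative sketch, not the paper's implementation: the `llm` and `fitness` callables, the prompt templates, and the 0.2 hyper-mutation rate are all assumptions; the paper's actual mutation operators are considerably richer.

```python
import random

def promptbreeder(llm, fitness, seed_task_prompts, seed_mutation_prompts,
                  generations=10, pop_size=8, rng=None):
    """Minimal sketch of Promptbreeder's self-referential evolution loop.

    llm(prompt) -> str and fitness(task_prompt) -> number are caller-supplied
    stand-ins (hypothetical signatures) for the LLM and the training-set
    evaluation used in the paper.
    """
    rng = rng or random.Random(0)
    # Each unit pairs a task-prompt with the mutation-prompt that evolves it.
    pop = [(rng.choice(seed_task_prompts), rng.choice(seed_mutation_prompts))
           for _ in range(pop_size)]
    for _ in range(generations):
        # Binary tournament: pick two units, replace the loser with a
        # mutated copy of the winner.
        i, j = rng.sample(range(pop_size), 2)
        win, lose = (i, j) if fitness(pop[i][0]) >= fitness(pop[j][0]) else (j, i)
        task_p, mut_p = pop[win]
        # First-order mutation: the LLM rewrites the task-prompt,
        # conditioned on the unit's mutation-prompt.
        new_task = llm(f"{mut_p}\nINSTRUCTION: {task_p}\nNEW INSTRUCTION:")
        # Hyper-mutation (the self-referential step): occasionally ask the
        # LLM to improve the mutation-prompt itself.
        if rng.random() < 0.2:  # rate is an assumption for illustration
            mut_p = llm(f"Improve this prompt-mutation instruction: {mut_p}")
        pop[lose] = (new_task, mut_p)
    # Return the fittest unit (task-prompt, mutation-prompt).
    return max(pop, key=lambda unit: fitness(unit[0]))
```

With a deterministic stub in place of the LLM and accuracy function, the loop simply keeps whichever rewritten task-prompt scores best, which is the core of the method regardless of how the mutation operators are instantiated.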

Chrisantha Fernando, Dylan Banarse, Henryk Michalewski, Simon Osindero, Tim Rocktäschel • 2023

Related benchmarks

Task                         | Dataset          | Metric       | Result | Rank
Mathematical Reasoning       | GSM8K (test)     | Accuracy     | 91.97  | 900
Code Generation              | HumanEval (test) | --           | --     | 506
Code Generation              | MBPP (test)      | --           | --     | 298
Mathematical Problem Solving | Gaokao MathQA    | Accuracy     | 76.6   | 60
Question Answering           | GPQA (test)      | Accuracy     | 40.9   | 55
Question Answering           | HotpotQA (test)  | --           | --     | 37
Tool Learning                | RestBench TMDB   | Success Rate | 74.1   | 32
Knowledge Intensive          | Gaokao History   | Accuracy     | 81.5   | 30
Function Calling             | BFCL Single-Turn | Accuracy     | 81.3   | 22
Function Calling             | BFCL Multi-turn  | Accuracy     | 36.2   | 22
(Showing 10 of 31 rows.)
