
PRL: Prompts from Reinforcement Learning

About

Effective prompt engineering remains a central challenge in fully harnessing the capabilities of LLMs. While well-designed prompts can dramatically enhance performance, crafting them typically demands expert intuition and a nuanced understanding of the task. Moreover, the most impactful prompts often hinge on subtle semantic cues, ones that may elude human perception but are crucial for guiding LLM behavior. In this paper, we introduce PRL (Prompts from Reinforcement Learning), a novel RL-based approach to automatic prompt generation. Unlike previous methods, PRL can produce novel few-shot examples that were not seen during training. Our approach achieves state-of-the-art performance across a range of benchmarks, including text classification, simplification, and summarization. On classification, it surpasses APE by 2.58% and EvoPrompt by 1.00%. It also improves the average ROUGE score on summarization by 4.32 over APE and 2.12 over EvoPrompt, and the SARI score on simplification by 6.93 over APE and 6.01 over EvoPrompt. Our code is available at https://github.com/Batorskq/prl.
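The abstract frames prompt generation as a reinforcement learning problem. The paper's actual setup (an LLM policy trained to emit full prompts, including novel few-shot examples) is not reproduced here; the following is a minimal toy sketch of the underlying idea only, assuming a fixed pool of candidate prompts, a made-up reward function standing in for downstream task accuracy, and a REINFORCE-style update that shifts the sampling policy toward higher-reward prompts.

```python
import math
import random

# Hypothetical illustration, not PRL's implementation: treat prompt selection
# as an RL problem. A softmax policy samples a prompt, a reward function
# scores it, and a REINFORCE update with a baseline reinforces good prompts.

CANDIDATE_PROMPTS = [  # hypothetical prompt pool, not from the paper
    "Classify the sentiment of the text.",
    "Label the text as positive or negative.",
    "Think step by step, then classify the sentiment.",
]

def reward(prompt: str) -> float:
    """Stand-in for validation accuracy under a given prompt.
    (Made-up numbers; a real system would query the downstream LLM.)"""
    return {
        "Classify the sentiment of the text.": 0.62,
        "Label the text as positive or negative.": 0.71,
        "Think step by step, then classify the sentiment.": 0.88,
    }[prompt]

def train(steps: int = 300, lr: float = 0.5, seed: int = 0) -> str:
    """REINFORCE with a mean-reward baseline over the fixed prompt pool."""
    rng = random.Random(seed)
    logits = [0.0] * len(CANDIDATE_PROMPTS)  # policy parameters
    for _ in range(steps):
        exps = [math.exp(x) for x in logits]  # softmax policy
        total = sum(exps)
        probs = [e / total for e in exps]
        i = rng.choices(range(len(probs)), weights=probs)[0]  # sample prompt
        # advantage = sampled reward minus expected reward under the policy
        adv = reward(CANDIDATE_PROMPTS[i]) - sum(
            p * reward(c) for p, c in zip(probs, CANDIDATE_PROMPTS)
        )
        # gradient of log pi(i) w.r.t. logit j is 1[j == i] - probs[j]
        for j in range(len(logits)):
            logits[j] += lr * adv * ((1.0 if j == i else 0.0) - probs[j])
    return CANDIDATE_PROMPTS[max(range(len(logits)), key=logits.__getitem__)]

best = train()
print(best)
```

The mean-reward baseline keeps the advantage centered, so prompts that merely match the policy's expected reward receive no reinforcement; only above-average prompts gain probability mass.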

Paweł Batorski, Adrian Kosmala, Paul Swoboda • 2025

Related benchmarks

Task                         Dataset         Metric    Result  Rank
Mathematical Reasoning       GSM8K           Accuracy  86.15   499
Mathematical Reasoning       MATH 500        Accuracy  44.4    442
Text Classification          AG News (test)  Accuracy  84.36   228
Text Classification          TREC            Accuracy  77.07   207
Text Classification          SST-2 (test)    Accuracy  96.32   185
Medical Question Answering   MedQA           Accuracy  53.34   153
Text Classification          MR (test)       Accuracy  91.27   148
Subjectivity Classification  Subj (test)     Accuracy  76.9    127
Text Classification          TREC (test)     Accuracy  77.07   115
Text Classification          MR              Accuracy  91.27   106
Showing 10 of 19 rows
