
PLR: Plackett-Luce for Reordering In-Context Learning Examples

About

In-context learning (ICL) adapts large language models by conditioning on a small set of examples, avoiding costly parameter updates. Among other factors, performance is often highly sensitive to the ordering of those examples, yet exhaustive search over the $n!$ possible orderings is infeasible. Existing, more efficient ordering methods therefore rely on model confidence measures (e.g., label-probability entropy) over label sets or search directly for a single best ordering. We propose PLR, a probabilistic approach that replaces discrete ordering search with learning a probability distribution over orderings: PLR models orderings with a Plackett-Luce distribution and iteratively updates its parameters to concentrate probability mass on orderings that score well under a task-level metric. Candidate orderings are sampled efficiently via a Gumbel perturb-and-sort procedure. Experiments on multiple classification benchmarks show that PLR consistently improves few-shot accuracy for $k \in \{4, 8, 16, 32\}$ examples, and we further demonstrate gains on mathematical reasoning tasks, where label-based ordering methods are not applicable. Our code is available at https://github.com/Batorskq/PLR.
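The two core ingredients of the abstract can be sketched in a few lines: sampling an ordering from a Plackett-Luce model via Gumbel perturb-and-sort (adding i.i.d. Gumbel(0, 1) noise to the log-scores and sorting in descending order yields an exact Plackett-Luce sample), and an iterative update that shifts probability mass toward high-scoring orderings. The update rule below is a hypothetical score-function-style illustration, not the paper's exact algorithm, and `update_scores` and its arguments are invented names for this sketch.

```python
import numpy as np

def sample_ordering(log_scores, rng):
    """Draw one ordering from a Plackett-Luce model via Gumbel perturb-and-sort.

    Adding i.i.d. Gumbel(0, 1) noise to the per-example log-scores and
    sorting in descending order gives an exact Plackett-Luce sample.
    """
    gumbel = rng.gumbel(size=log_scores.shape)
    return np.argsort(-(log_scores + gumbel))

def update_scores(log_scores, orderings, rewards, lr=0.5):
    """Illustrative update (assumption, not the paper's rule): raise the
    log-score of examples that sit early in above-average-reward orderings,
    lower it for those that sit late, so the distribution concentrates on
    high-performing orderings over iterations."""
    n = len(log_scores)
    baseline = np.mean(rewards)           # simple variance-reduction baseline
    grad = np.zeros(n)
    for perm, r in zip(orderings, rewards):
        ranks = np.empty(n)
        ranks[perm] = np.arange(n)        # position of each example in perm
        # weight in [-1, 1]: +1 for first position, -1 for last
        grad += (r - baseline) * (1.0 - 2.0 * ranks / (n - 1))
    return log_scores + lr * grad / len(orderings)
```

In a full loop one would sample a batch of orderings, evaluate each with the task-level metric (e.g., dev-set accuracy of the LLM prompted with that ordering), and feed the rewards back through the update.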

Pawel Batorski, Paul Swoboda • 2026

Related benchmarks

Task                         Dataset         Metric    Result   Rank
Mathematical Reasoning       GSM8K (test)    Accuracy  42.85    900
Text Classification          MR (test)       Accuracy  93.13    148
Subjectivity Classification  Subj (test)     Accuracy  93.59    127
Text Classification          TREC (test)     Accuracy  70.63    115
Text Classification          SST-5 (test)    Accuracy  55.96    60
Mathematical Reasoning       DeepMath        Accuracy  46.36    30
Text Classification          News (test)     Accuracy  86.31    2
