Selecting Better Samples from Pre-trained LLMs: A Case Study on Question Generation

About

Large Language Models (LLMs) have in recent years demonstrated impressive prowess in natural language generation. A common practice to improve generation diversity is to sample multiple outputs from the model. However, a simple and robust way of selecting the best output from these stochastic samples is lacking. As a case study in question generation, we propose two prompt-based approaches to selecting high-quality questions from a set of LLM-generated candidates. Our method works under two constraints: 1) a black-box (non-modifiable) question generation model and 2) no access to human-annotated references. Both are realistic limitations for real-world deployment of LLMs. With automatic as well as human evaluations, we empirically demonstrate that our approach can effectively select questions of higher quality than greedy generation.
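The general pattern of selecting one output from several sampled candidates can be sketched as below. This is only an illustration of sample-then-select, not the authors' method: `select_best` and `toy_score` are hypothetical names, and `toy_score` is a simple surface heuristic standing in for a prompt-based scorer, which in practice would query an LLM.

```python
from typing import Callable, Sequence


def select_best(candidates: Sequence[str], score: Callable[[str], float]) -> str:
    """Return the highest-scoring candidate (the first one wins on ties)."""
    if not candidates:
        raise ValueError("need at least one candidate")
    return max(candidates, key=score)


def toy_score(question: str) -> float:
    """Hypothetical stand-in scorer (assumption, not from the paper).

    Gives one point for ending with '?' and one point for starting
    with a common question word.
    """
    q = question.strip()
    question_words = {"who", "what", "when", "where", "why", "how"}
    ends_ok = 1.0 if q.endswith("?") else 0.0
    starts_ok = 1.0 if q.split(" ")[0].lower() in question_words else 0.0
    return ends_ok + starts_ok


# Candidates that would normally come from stochastic sampling of a black-box model.
samples = [
    "the fox ran away",
    "Why did the fox run away?",
    "Fox run why",
]
best = select_best(samples, toy_score)
print(best)
```

Because the scorer is passed in as a plain function, the generation model stays untouched and no reference questions are needed, which mirrors the two constraints described above.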

Xingdi Yuan, Tong Wang, Yen-Hsiang Wang, Emery Fine, Rania Abdelghani, Pauline Lucas, Hélène Sauzéon, Pierre-Yves Oudeyer • 2022

Related benchmarks

Task	Dataset	Metric	Result
Question Generation	SQuAD	BLEU-4	0.401	21
Question Generation	Fairytale QA	ROUGE-L	43.9	17
Question Selection	Fairytale QA (test)	Grammatical Correctness	97.5	14
Question Selection	SQuAD	Grammatical Correctness	0.983	14

