STaR-GATE: Teaching Language Models to Ask Clarifying Questions

About

When prompting language models to complete a task, users often leave important aspects unsaid. While asking questions could resolve this ambiguity (GATE; Li et al., 2023), models often struggle to ask good questions. We explore a language model's ability to self-improve (STaR; Zelikman et al., 2022) by rewarding the model for generating useful questions, a simple method we dub STaR-GATE. We generate a synthetic dataset of 25,500 unique persona-task prompts to simulate conversations between a pretrained language model (the Questioner) and a Roleplayer whose preferences are unknown to the Questioner. By asking questions, the Questioner elicits preferences from the Roleplayer. The Questioner is iteratively finetuned on questions that increase the probability of high-quality responses to the task, which are generated by an Oracle with access to the Roleplayer's latent preferences. After two iterations of self-improvement, the Questioner asks better questions, allowing it to generate responses that are preferred over responses from the initial model on 72% of tasks. Our results indicate that teaching a language model to ask better questions leads to better personalized responses.
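The selection step described above (keeping the questions that raise the probability of the Oracle's response) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `Simulation` record, the `oracle_logprob` score, the per-task `baseline`, and the `top_k` cutoff are all hypothetical names and assumptions introduced here, and the scores are made-up numbers standing in for log-probabilities a real model would compute.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Simulation:
    """One simulated Questioner/Roleplayer conversation (hypothetical record)."""
    task_id: str
    conversation: List[str]   # alternating Questioner and Roleplayer turns
    oracle_logprob: float     # assumed score: log p(Oracle response | task, conversation)


def select_for_finetuning(
    sims: List[Simulation],
    baseline: Dict[str, float],
    top_k: int = 1,
) -> List[Simulation]:
    """Keep, per task, the top_k conversations that beat the no-question baseline."""
    by_task = defaultdict(list)
    for sim in sims:
        # A conversation is useful only if it raises the Oracle response's
        # log-probability above the score obtained without asking anything.
        if sim.oracle_logprob > baseline.get(sim.task_id, float("-inf")):
            by_task[sim.task_id].append(sim)

    selected = []
    for group in by_task.values():
        group.sort(key=lambda s: s.oracle_logprob, reverse=True)
        selected.extend(group[:top_k])
    return selected


# Toy data: scores are illustrative, not measured.
sims = [
    Simulation("t1", ["Q: Any dietary restrictions?", "A: I'm vegetarian."], -12.0),
    Simulation("t1", ["Q: What's your favorite color?", "A: Blue."], -20.0),
    Simulation("t2", ["Q: How are you?", "A: Fine."], -18.0),
]
baseline = {"t1": -15.0, "t2": -17.0}
kept = select_for_finetuning(sims, baseline)
```

In this toy run only the first `t1` conversation survives, since it is the only one whose score exceeds its task's baseline. In an iterative setup, the kept conversations would form the finetuning set for the next Questioner, and the simulate-then-select loop would repeat.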

Chinmaya Andukuri, Jan-Philipp Fränken, Tobias Gerstenberg, Noah D. Goodman • 2024

Related benchmarks

Task | Dataset | Result | Rank
Science Question Answering | ScienceQA (test) | Average Accuracy 90.05 | 245
Scientific Question Answering | SciQA | Accuracy 81.86 | 35
Image Styling | Regular Dataset | Overall Score 82.9 | 29
Reasoning Question Answering | ARC | Accuracy 69.9 | 21
Image Styling | Complex | Overall Score 84.46 | 16
Conversational SQL | CoSQL | Accuracy 61.67 | 14
Image Styling | Simple Dataset | Overall Score 79.62 | 12
Image Styling | Regular Text (test) | Overall Score 77.62 | 8
Image Styling | Regular Dataset (test) | Overall Score 84.46 | 8
Image Quality Evaluation | Regular Dataset Complex Vision-4B | Overall Score 84.78 | 8
Showing 10 of 13 rows
