
Noisy Channel Language Model Prompting for Few-Shot Text Classification

About

We introduce a noisy channel approach for language model prompting in few-shot text classification. Instead of computing the likelihood of the label given the input (referred to as direct models), channel models compute the conditional probability of the input given the label, and are thereby required to explain every word in the input. We use channel models for recently proposed few-shot learning methods with no or very limited updates to the language model parameters, via either in-context demonstration or prompt tuning. Our experiments show that, for both methods, channel models significantly outperform their direct counterparts, which we attribute to their stability, i.e., lower variance and higher worst-case accuracy. We also present extensive ablations that provide recommendations for when to use channel prompt tuning instead of other competitive methods (e.g., direct head tuning): channel prompt tuning is preferred when the number of training examples is small, labels in the training data are imbalanced, or generalization to unseen labels is required.
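The core contrast in the abstract can be sketched in a few lines: a direct model scores P(label | input), while a channel model scores P(input | label-verbalization) and picks the label whose verbalized prompt best explains the input. The sketch below is illustrative only; the verbalizers, the `lm_logprob` interface, and the toy scores are assumptions standing in for a real pretrained language model, not the paper's implementation.

```python
def channel_classify(x, labels, verbalize, lm_logprob):
    """Channel scoring: argmax over labels y of log P(x | verbalize(y)).

    This reverses direct scoring (argmax_y log P(verbalize(y) | x)), so the
    model must assign probability to every word of the input x.
    """
    return max(labels, key=lambda y: lm_logprob(context=verbalize(y), continuation=x))

# Toy log-probability table standing in for a real LM (values are invented).
_SCORES = {
    ("This movie is great.", "Three hours of nonstop fun!"): -8.0,
    ("This movie is terrible.", "Three hours of nonstop fun!"): -15.0,
}

def toy_lm_logprob(context, continuation):
    """Hypothetical LM interface: log P(continuation | context)."""
    return _SCORES.get((context, continuation), -100.0)

label = channel_classify(
    x="Three hours of nonstop fun!",
    labels=["positive", "negative"],
    verbalize=lambda y: {"positive": "This movie is great.",
                         "negative": "This movie is terrible."}[y],
    lm_logprob=toy_lm_logprob,
)
```

Under the toy scores, the "positive" verbalizer explains the review better, so `label` comes out `"positive"`; with a real LM, `lm_logprob` would sum token log-probabilities of `x` conditioned on the verbalized prompt.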

Sewon Min, Mike Lewis, Hannaneh Hajishirzi, Luke Zettlemoyer • 2021

Related benchmarks

Task                           | Dataset       | Result         | Rank
Question Answering             | ARC Challenge | –              | 749
Question Answering             | OpenBookQA    | Accuracy 38.57 | 465
Physical Commonsense Reasoning | PIQA          | Accuracy 47.08 | 329
Sentiment Classification       | SST-2         | Accuracy 85.2  | 174
Common Sense Reasoning         | WinoGrande    | Accuracy 50.99 | 156
Common Sense Reasoning         | COPA          | Accuracy 50.13 | 138
Sentence Completion            | HellaSwag     | Accuracy 20.82 | 133
Word Sense Disambiguation      | WiC           | –              | 84
Story Completion               | StoryCloze    | Accuracy 57.84 | 65
Coreference Resolution         | WSC           | F1 Score 46.38 | 7

(Showing 10 of 15 rows.)
