
Noisy Channel Language Model Prompting for Few-Shot Text Classification

About

We introduce a noisy channel approach for language model prompting in few-shot text classification. Instead of computing the likelihood of the label given the input (referred to as direct models), channel models compute the conditional probability of the input given the label, and are thereby required to explain every word in the input. We use channel models for recently proposed few-shot learning methods with no or very limited updates to the language model parameters, via either in-context demonstration or prompt tuning. Our experiments show that, for both methods, channel models significantly outperform their direct counterparts, which we attribute to their stability, i.e., lower variance and higher worst-case accuracy. We also present extensive ablations that provide recommendations for when to use channel prompt tuning instead of other competitive methods (e.g., direct head tuning): channel prompt tuning is preferred when the number of training examples is small, labels in the training data are imbalanced, or generalization to unseen labels is required.
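The core contrast in the abstract can be sketched in a few lines: a direct model scores P(label | input), while a channel model scores P(input | label-verbalization) and picks the label whose verbalized prompt best explains the input. The sketch below is illustrative only; the verbalizers, the `lm_logprob` interface, and the toy scores are assumptions standing in for a real pretrained language model, not the paper's implementation.

```python
def channel_classify(x, labels, verbalize, lm_logprob):
    """Channel scoring: argmax over labels y of log P(x | verbalize(y)).

    This reverses direct scoring (argmax_y log P(verbalize(y) | x)), so the
    model must assign probability to every word of the input x.
    """
    return max(labels, key=lambda y: lm_logprob(context=verbalize(y), continuation=x))

# Toy log-probability table standing in for a real LM (values are invented).
_SCORES = {
    ("This movie is great.", "Three hours of nonstop fun!"): -8.0,
    ("This movie is terrible.", "Three hours of nonstop fun!"): -15.0,
}

def toy_lm_logprob(context, continuation):
    """Hypothetical LM interface: log P(continuation | context)."""
    return _SCORES.get((context, continuation), -100.0)

label = channel_classify(
    x="Three hours of nonstop fun!",
    labels=["positive", "negative"],
    verbalize=lambda y: {"positive": "This movie is great.",
                         "negative": "This movie is terrible."}[y],
    lm_logprob=toy_lm_logprob,
)
```

Under the toy scores, the "positive" verbalizer explains the review better, so `label` comes out `"positive"`; with a real LM, `lm_logprob` would sum token log-probabilities of `x` conditioned on the verbalized prompt.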

Sewon Min, Mike Lewis, Hannaneh Hajishirzi, Luke Zettlemoyer • 2021

Related benchmarks

Task                           | Dataset       | Result         | Rank
Question Answering             | ARC Challenge | –              | 749
Question Answering             | OpenBookQA    | Accuracy 38.57 | 465
Physical Commonsense Reasoning | PIQA          | Accuracy 47.08 | 329
Sentiment Classification       | SST-2         | Accuracy 85.2  | 174
Common Sense Reasoning         | WinoGrande    | Accuracy 50.99 | 156
Common Sense Reasoning         | COPA          | Accuracy 50.13 | 138
Sentence Completion            | HellaSwag     | Accuracy 20.82 | 133
Word Sense Disambiguation      | WiC           | –              | 84
Story Completion               | StoryCloze    | Accuracy 57.84 | 65
Coreference Resolution         | WSC           | F1 Score 46.38 | 7

(Showing 10 of 15 rows.)
