
A La Carte Embedding: Cheap but Effective Induction of Semantic Feature Vectors

About

Motivations like domain adaptation, transfer learning, and feature learning have fueled interest in inducing embeddings for rare or unseen words, n-grams, synsets, and other textual features. This paper introduces a la carte embedding, a simple and general alternative to the usual word2vec-based approaches for building such representations that is based upon recent theoretical results for GloVe-like embeddings. Our method relies mainly on a linear transformation that is efficiently learnable using pretrained word vectors and linear regression. This transform is applicable on the fly in the future when a new text feature or rare word is encountered, even if only a single usage example is available. We introduce a new dataset showing how the a la carte method requires fewer examples of words in context to learn high-quality embeddings and we obtain state-of-the-art results on a nonce task and some unsupervised document classification tasks.

Mikhail Khodak, Nikunj Saunshi, Yingyu Liang, Tengyu Ma, Brandon Stewart, Sanjeev Arora · 2018

Related benchmarks

Task                          Dataset                        Metric                 Result   Rank
Subjectivity Classification   Subj                           Accuracy               93.8     266
Text Classification           TREC                           Accuracy               89       179
Sentiment Classification      CR                             Accuracy               84.3     142
Text Classification           IMDB                           Accuracy               90.9     107
Text Classification           MR                             Accuracy               81.8     93
Text Classification           SST binary                     Accuracy               86.7     29
Text Classification           MPQA                           Accuracy               87.6     25
Few-shot embedding induction  Chimera 1.0 (test)             Spearman Correlation   0.3941   15
Text Classification           SST fine-grained               Accuracy               48.1     10
Word Sense Disambiguation     SemEval-2013 Task 12 (nouns)   --                     --       7

(10 of 12 rows shown)

Other info

Code
