Distilling Word Embeddings: An Encoding Approach

About

Distilling knowledge from a well-trained, cumbersome network into a small one has recently become a research topic, as lightweight neural networks with high performance are particularly needed in resource-restricted systems. This paper addresses the problem of distilling word embeddings for NLP tasks. We propose an encoding approach that distills task-specific knowledge from a set of high-dimensional embeddings; it reduces model complexity by a large margin while retaining high accuracy, offering a good compromise between efficiency and performance. Experiments on two tasks show that distilling knowledge from cumbersome embeddings is better than directly training neural networks with small embeddings.

Lili Mou, Ran Jia, Yan Xu, Ge Li, Lu Zhang, Zhi Jin • 2015
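
The abstract only names the idea of an encoding approach, so the following is a minimal sketch under stated assumptions, not the authors' implementation. It assumes the encoding is a single tanh layer that maps each high-dimensional embedding to a low-dimensional one. The dimensions (300 and 50), the toy vocabulary, and the function names are hypothetical. The weights are random here; in practice they would be learned with task supervision before the small embeddings are handed to a smaller network.

```python
import math
import random


def make_matrix(rows, cols, rng, scale=0.1):
    """Build a rows x cols matrix of small random weights (untrained placeholder)."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def encode(embedding, weights, bias):
    """Map one high-dimensional vector to a low-dimensional one: tanh(W e + b)."""
    return [
        math.tanh(sum(w * x for w, x in zip(row, embedding)) + b)
        for row, b in zip(weights, bias)
    ]


def distill_table(large_table, weights, bias):
    """Apply the encoding layer to every word, giving a small embedding table."""
    return {word: encode(vec, weights, bias) for word, vec in large_table.items()}


rng = random.Random(0)
LARGE_DIM, SMALL_DIM = 300, 50      # hypothetical sizes, not taken from the paper
vocab = ["good", "bad", "movie"]    # toy vocabulary for illustration

# Stand-in for pretrained high-dimensional ("cumbersome") embeddings.
large_table = {w: [rng.gauss(0.0, 1.0) for _ in range(LARGE_DIM)] for w in vocab}

# Encoding layer parameters; assumed to be trained on the task in a real setup.
weights = make_matrix(SMALL_DIM, LARGE_DIM, rng)
bias = [0.0] * SMALL_DIM

small_table = distill_table(large_table, weights, bias)
```

After encoding, only `small_table` needs to be stored and fed to the downstream model, so the embedding parameters per word drop from `LARGE_DIM` to `SMALL_DIM` in this sketch.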

Related benchmarks

Task                         Dataset                                                 Result            Rank
Subjectivity Classification  Subj                                                    Accuracy 90.64    343
Question Classification      TREC                                                    Accuracy 90.6     262
Opinion Polarity Detection   MPQA                                                    Accuracy 88.65    158
Sentiment Classification     MR                                                      Accuracy 77.11    148
Sentiment Classification     CR                                                      Accuracy 80.88    142
Sentiment Classification     Stanford Sentiment Treebank SST-2 (test)                Accuracy 83.71    105
Sentence Classification      Stanford Sentiment Treebank (SST) fine-grained (test)   Accuracy 44.94    40