A Structured Self-attentive Sentence Embedding
About
This paper proposes a new model for extracting an interpretable sentence embedding by introducing self-attention. Instead of using a vector, we use a 2-D matrix to represent the embedding, with each row of the matrix attending on a different part of the sentence. We also propose a self-attention mechanism and a special regularization term for the model. As a side effect, the embedding comes with an easy way of visualizing what specific parts of the sentence are encoded into the embedding. We evaluate our model on 3 different tasks: author profiling, sentiment classification, and textual entailment. Results show that our model yields a significant performance gain compared to other sentence embedding methods in all of the 3 tasks.
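The attention mechanism described above can be sketched in a few lines of numpy. This is a minimal illustration, not the authors' implementation: the weight matrices and dimensions below are placeholder assumptions, and `H` stands in for the bidirectional-LSTM hidden states the paper computes.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def structured_self_attention(H, W_s1, W_s2):
    """Sketch of the paper's 2-D sentence embedding M = A H.

    H:    (n, 2u)  hidden states for a sentence of n tokens
    W_s1: (d_a, 2u) first attention weight matrix
    W_s2: (r, d_a)  second weight matrix; r = number of attention hops
    """
    # A = softmax(W_s2 tanh(W_s1 H^T)), shape (r, n):
    # each of the r rows attends to a different part of the sentence.
    A = softmax(W_s2 @ np.tanh(W_s1 @ H.T), axis=-1)
    M = A @ H  # (r, 2u) matrix embedding instead of a single vector
    # Regularization term ||A A^T - I||_F^2 pushes the r attention
    # rows to focus on different parts of the sentence.
    penalty = np.sum((A @ A.T - np.eye(A.shape[0])) ** 2)
    return M, A, penalty

# Illustrative dimensions (not the paper's experimental settings).
rng = np.random.default_rng(0)
n, u, d_a, r = 8, 5, 6, 3
H = rng.standard_normal((n, 2 * u))
W_s1 = rng.standard_normal((d_a, 2 * u))
W_s2 = rng.standard_normal((r, d_a))
M, A, penalty = structured_self_attention(H, W_s1, W_s2)
print(M.shape, A.shape)  # (3, 10) (3, 8)
```

Each row of `A` is a probability distribution over the `n` tokens, so rows of `A` sum to 1; the visualization the abstract mentions simply plots these rows as heatmaps over the sentence.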
Zhouhan Lin, Minwei Feng, Cicero Nogueira dos Santos, Mo Yu, Bing Xiang, Bowen Zhou, Yoshua Bengio • 2017
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Natural Language Inference | SNLI (test) | Accuracy | 84.2 | 681 |
| Text Classification | AG News (test) | -- | -- | 210 |
| Text Classification | SST-2 (test) | Accuracy | 86.4 | 185 |
| Sentiment Classification | IMDB (test) | Error Rate | 6.79 | 144 |
| Sentiment Classification | MR (test) | Accuracy | 81.7 | 142 |
| Text Classification | Yahoo! Answers (test) | -- | -- | 133 |
| Text Classification | TREC (test) | -- | -- | 113 |
| Text Classification | IMDB (test) | CA | 43.3 | 79 |
| Review Sentiment Classification | Yelp 2014 (test) | Accuracy | 61.5 | 41 |
| Text Classification | DBPedia (test) | Test Error Rate | 0.007 | 40 |