Skip-Thought Vectors
About
We describe an approach for unsupervised learning of a generic, distributed sentence encoder. Using the continuity of text from books, we train an encoder-decoder model that tries to reconstruct the surrounding sentences of an encoded passage. Sentences that share semantic and syntactic properties are thus mapped to similar vector representations. We next introduce a simple vocabulary expansion method to encode words that were not seen as part of training, allowing us to expand our vocabulary to a million words. After training our model, we extract and evaluate our vectors with linear models on 8 tasks: semantic relatedness, paraphrase detection, image-sentence ranking, question-type classification and 4 benchmark sentiment and subjectivity datasets. The end result is an off-the-shelf encoder that can produce highly generic sentence representations that are robust and perform well in practice. We will make our encoder publicly available.
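The vocabulary-expansion step described above can be sketched as a simple linear regression: learn a map from a large pretrained word space (such as word2vec) into the encoder's word-embedding space using the words the two vocabularies share, then map unseen words through it. The dimensions, data, and variable names below are synthetic placeholders, not the paper's actual matrices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 300-d pretrained vectors, 620-d encoder embeddings,
# 1000 words present in both vocabularies.
d_w2v, d_rnn, n_shared = 300, 620, 1000
V_w2v = rng.normal(size=(n_shared, d_w2v))          # pretrained vectors (shared vocab)
W_true = rng.normal(size=(d_w2v, d_rnn)) / np.sqrt(d_w2v)
V_rnn = V_w2v @ W_true                              # encoder embeddings (synthetic)

# Un-regularized least squares: W = argmin_W ||V_w2v W - V_rnn||^2
W, *_ = np.linalg.lstsq(V_w2v, V_rnn, rcond=None)

# Any word with a pretrained vector can now be projected into the encoder's
# embedding space, even if it was never seen during skip-thought training.
unseen_word_vec = rng.normal(size=(d_w2v,))
expanded = unseen_word_vec @ W
print(expanded.shape)  # (620,)
```

With enough shared words relative to the embedding dimensionality, this single linear map lets the trained encoder handle a vocabulary far larger than the one seen during training.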
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Natural Language Inference | SNLI (test) | Accuracy | 87.7 | 681 |
| Subjectivity Classification | Subj | Accuracy | 94.2 | 266 |
| Question Classification | TREC | Accuracy | 92.2 | 205 |
| Text Classification | TREC | Accuracy | 93 | 179 |
| Opinion Polarity Detection | MPQA | Accuracy | 89.3 | 154 |
| Sentiment Classification | MR | Accuracy | 76.5 | 148 |
| Sentiment Classification | IMDB (test) | Error Rate | 17.42 | 144 |
| Sentiment Classification | CR | Accuracy | 83.8 | 142 |
| Subjectivity Classification | Subj (test) | Accuracy | 93.6 | 125 |
| Question Classification | TREC (test) | Accuracy | 92.2 | 124 |