Semi-supervised Convolutional Neural Networks for Text Categorization via Region Embedding
About
This paper presents a new semi-supervised framework with convolutional neural networks (CNNs) for text categorization. Unlike the previous approaches that rely on word embeddings, our method learns embeddings of small text regions from unlabeled data for integration into a supervised CNN. The proposed scheme for embedding learning is based on the idea of two-view semi-supervised learning, which is intended to be useful for the task of interest even though the training is done on unlabeled data. Our models achieve better results than previous approaches on sentiment classification and topic classification tasks.
Rie Johnson, Tong Zhang • 2015
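As a rough illustration of the two-view idea mentioned in the abstract, the sketch below builds training pairs from unlabeled text: one view is a small text region and the other is the region that immediately follows it. The pairing scheme, the bag-of-words encoding, and the names `region_pairs` and `bag_of_words` are assumptions made for illustration. The abstract does not specify them, and this is not the authors' implementation.

```python
from collections import Counter


def region_pairs(tokens, region_size):
    """Return (region, following_region) pairs from one unlabeled token list.

    Assumption: the second view is the adjacent region of the same size.
    """
    pairs = []
    # Slide over every position where both a region and its successor fit.
    for i in range(len(tokens) - 2 * region_size + 1):
        region = tokens[i:i + region_size]
        context = tokens[i + region_size:i + 2 * region_size]
        pairs.append((region, context))
    return pairs


def bag_of_words(region, vocab):
    """Encode a region as word counts over a fixed, ordered vocabulary."""
    counts = Counter(region)
    return [counts.get(word, 0) for word in vocab]


tokens = "i really love this movie".split()
vocab = sorted(set(tokens))
pairs = region_pairs(tokens, region_size=2)

for region, context in pairs:
    print(bag_of_words(region, vocab), "->", bag_of_words(context, vocab))
```

Under these assumptions, a model trained to predict the second view from the first needs no labels, and its intermediate representation of the first view could serve as the region embedding passed to a supervised CNN.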
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Text Classification | AG News (test) | -- | 210 |
| Sentiment Classification | IMDB (test) | Error Rate: 0.0651 | 144 |
| Topic Classification | DBPedia (test) | -- | 64 |
| Text Categorization | RCV1 (test) | Error Rate: 0.0797 | 24 |
| Text Categorization | Elec (test) | Error Rate: 5.87 | 16 |
| Sentiment Classification | Elec | Error Rate: 6.27 | 15 |
| Binary Sentiment Classification | ACL-IMDB (test) | Error Rate: 7.67 | 12 |
| Fine-grained Sentiment Classification | IMDB (test) | Error Rate (%): 38.15 | 9 |
| Topic Classification | arXiv (test) | Error Rate (%): 35.89 | 6 |
| Binary Sentiment Classification | Elec (test) | Error Rate (%): 7.14 | 5 |