Semi-supervised Sequence Learning
About
We present two approaches that use unlabeled data to improve sequence learning with recurrent networks. The first approach is to predict what comes next in a sequence, which is a conventional language model in natural language processing. The second approach is to use a sequence autoencoder, which reads the input sequence into a vector and then predicts the input sequence again. These two algorithms can be used as a "pretraining" step for a later supervised sequence learning algorithm. In other words, the parameters obtained from the unsupervised step can serve as a starting point for subsequent supervised training. In our experiments, we find that long short-term memory (LSTM) recurrent networks pretrained with the two approaches are more stable and generalize better. With pretraining, we are able to train LSTM recurrent networks on sequences of up to a few hundred timesteps, thereby achieving strong performance on many text classification tasks, such as IMDB, DBpedia, and 20 Newsgroups.
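The sequence-autoencoder idea can be summarized in a short sketch. The following is a minimal, illustrative PyTorch example (the module and variable names are assumptions for illustration, not the authors' original code): an LSTM encoder reads the input into a final state, an LSTM decoder is trained to reconstruct the same sequence, and the pretrained embedding and encoder weights then initialize a supervised LSTM classifier.

```python
# Minimal sketch of sequence-autoencoder pretraining followed by supervised
# fine-tuning (PyTorch assumed; names are illustrative, not the paper's code).
import torch
import torch.nn as nn

class SeqAutoencoder(nn.Module):
    def __init__(self, vocab_size, embed_dim=128, hidden_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.encoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.decoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens):
        emb = self.embed(tokens)              # (batch, time, embed_dim)
        _, state = self.encoder(emb)          # final (h, c) summarizes the input
        dec_out, _ = self.decoder(emb, state) # teacher-forced reconstruction
        return self.out(dec_out)              # logits over the vocabulary

class LSTMClassifier(nn.Module):
    def __init__(self, vocab_size, num_classes, embed_dim=128, hidden_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.encoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.cls = nn.Linear(hidden_dim, num_classes)

    def forward(self, tokens):
        emb = self.embed(tokens)
        _, (h, _) = self.encoder(emb)
        return self.cls(h[-1])                # classify from the final hidden state

# After pretraining the autoencoder on unlabeled text (cross-entropy on the
# reconstruction), copy its embedding and encoder weights into the classifier
# as the starting point for supervised training. The language-model variant
# is analogous: pretrain a next-token predictor and reuse its LSTM weights.
vocab_size, num_classes = 10_000, 2
ae = SeqAutoencoder(vocab_size)
clf = LSTMClassifier(vocab_size, num_classes)
clf.embed.load_state_dict(ae.embed.state_dict())
clf.encoder.load_state_dict(ae.encoder.state_dict())
```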
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Sentiment Analysis | IMDB (test) | Accuracy | 92.8 | 248 |
| Sentiment Classification | IMDB (test) | Error Rate | 7.24 | 144 |
| Text Classification | Yahoo! Answers (test) | Clean Accuracy | 65.6 | 133 |
| Topic Classification | DBPedia (test) | -- | -- | 64 |
| Text Classification | Yelp (test) | Accuracy | 57.7 | 55 |
| Text Categorization | RCV1 (test) | Error Rate | 0.1465 | 24 |
| Text Categorization | Elec (test) | Error Rate | 6.84 | 16 |
| Binary Sentiment Classification | ACL-IMDB (test) | Error Rate | 7.24 | 12 |
| Sentiment Classification | Rotten Tomatoes (test) | Test Error Rate | 16.7 | 8 |
| Character-level Classification | DBpedia character-level (test) | Test Error Rate | 1.5 | 7 |