# Long Short-Term Memory-Networks for Machine Reading

## About
In this paper we address the question of how to render sequence-level networks better at handling structured input. We propose a machine reading simulator which processes text incrementally from left to right and performs shallow reasoning with memory and attention. The reader extends the Long Short-Term Memory architecture with a memory network in place of a single memory cell. This enables adaptive memory usage during recurrence with neural attention, offering a way to weakly induce relations among tokens. The system is initially designed to process a single sequence but we also demonstrate how to integrate it with an encoder-decoder architecture. Experiments on language modeling, sentiment analysis, and natural language inference show that our model matches or outperforms the state of the art.
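The key mechanism can be sketched as follows: instead of carrying a single previous hidden/cell pair, each step attends over tapes of all past hidden states and memory cells to form an adaptive summary before applying the usual LSTM gates. This is a minimal numpy illustration of that idea, not the paper's exact parameterization; the weight matrices `Wa, Wi, Wf, Wo, Wc` and the function `lstmn_step` are hypothetical names introduced here for clarity.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstmn_step(x_t, H, C, params):
    """One step of a simplified LSTMN-style cell (illustrative sketch).

    H, C: arrays of shape (t, d) holding all previous hidden states
    and memory cells (the "memory tape"). The cell attends over H to
    build adaptive summaries h_tilde, c_tilde, then applies standard
    LSTM gating. Weight shapes: Wa is (d, k); Wi, Wf, Wo, Wc are
    (d, k + d) for input dimension k and state dimension d.
    """
    Wa, Wi, Wf, Wo, Wc = params
    # Attention scores between the current input and each past hidden state.
    scores = H @ (Wa @ x_t)            # shape (t,)
    alpha = softmax(scores)            # attention distribution over the tape
    h_tilde = alpha @ H                # adaptive hidden summary, shape (d,)
    c_tilde = alpha @ C                # adaptive memory summary, shape (d,)
    z = np.concatenate([x_t, h_tilde])
    i = sigmoid(Wi @ z)                # input gate
    f = sigmoid(Wf @ z)                # forget gate
    o = sigmoid(Wo @ z)                # output gate
    c_t = f * c_tilde + i * np.tanh(Wc @ z)
    h_t = o * np.tanh(c_t)
    return h_t, c_t, alpha
```

The attention weights `alpha` are what "weakly induce relations among tokens": they tell the reader which earlier positions the current token draws its state from.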
## Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Natural Language Inference | SNLI (test) | Accuracy | 86.3 | 681 |
| Language Modeling | Penn Treebank (test) | Perplexity | 102 | 411 |
| Sentiment Analysis | SST-5 (test) | Accuracy | 47.9 | 173 |
| Natural Language Inference | SNLI (train) | Accuracy | 89.5 | 154 |
| Text Classification | SST-2 | Accuracy | 87.3 | 121 |
| Sentiment Classification | Stanford Sentiment Treebank SST-2 (test) | Accuracy | 87 | 99 |
| Text Classification | SST-1 | Accuracy | 49.3 | 45 |