A Decomposable Attention Model for Natural Language Inference
About
We propose a simple neural architecture for natural language inference. Our approach uses attention to decompose the problem into subproblems that can be solved separately, thus making it trivially parallelizable. On the Stanford Natural Language Inference (SNLI) dataset, we obtain state-of-the-art results with almost an order of magnitude fewer parameters than previous work and without relying on any word-order information. Adding intra-sentence attention that takes a minimum amount of order into account yields further improvements.
Ankur P. Parikh, Oscar Täckström, Dipanjan Das, Jakob Uszkoreit • 2016
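To make the decomposition in the abstract concrete, here is a minimal NumPy sketch of the attend, compare, and aggregate steps described in the paper. It is an illustration under stated assumptions, not the authors' implementation: the helper names (`init_params`, `decomposable_attention`), the single-layer ReLU networks, the layer sizes, and the random untrained weights are all placeholders, and the optional intra-sentence attention is left out.

```python
import numpy as np

def relu_layer(x, w, b):
    """One feed-forward layer with ReLU, applied to each row of x."""
    return np.maximum(x @ w + b, 0.0)

def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def init_params(embed_dim, hidden_dim, num_classes, seed=0):
    """Random, untrained weights (hypothetical sizes, for illustration only)."""
    rng = np.random.default_rng(seed)

    def layer(n_in, n_out):
        return rng.normal(scale=0.1, size=(n_in, n_out)), np.zeros(n_out)

    return {
        "attend": layer(embed_dim, hidden_dim),
        "compare": layer(2 * embed_dim, hidden_dim),
        "classify": layer(2 * hidden_dim, num_classes),
    }

def decomposable_attention(a, b, params):
    """a: (len_a, d) premise embeddings; b: (len_b, d) hypothesis embeddings."""
    # Attend: score every premise token against every hypothesis token.
    fa = relu_layer(a, *params["attend"])
    fb = relu_layer(b, *params["attend"])
    scores = fa @ fb.T                     # shape (len_a, len_b)
    beta = softmax(scores, axis=1) @ b     # soft-aligned part of b for each a_i
    alpha = softmax(scores, axis=0).T @ a  # soft-aligned part of a for each b_j

    # Compare: each token and its aligned phrase form an independent subproblem.
    v1 = relu_layer(np.concatenate([a, beta], axis=1), *params["compare"])
    v2 = relu_layer(np.concatenate([b, alpha], axis=1), *params["compare"])

    # Aggregate: sum over tokens (order-independent), then a linear classifier.
    pooled = np.concatenate([v1.sum(axis=0), v2.sum(axis=0)])
    w, bias = params["classify"]
    return pooled @ w + bias               # unnormalized class scores

# Toy usage with random "embeddings": 5 premise tokens, 3 hypothesis tokens.
rng = np.random.default_rng(1)
premise = rng.normal(size=(5, 8))
hypothesis = rng.normal(size=(3, 8))
params = init_params(embed_dim=8, hidden_dim=16, num_classes=3)
logits = decomposable_attention(premise, hypothesis, params)
reversed_logits = decomposable_attention(premise[::-1], hypothesis, params)
```

Because every comparison depends only on one token and its soft-aligned counterpart, and the aggregation is a plain sum, reversing the premise tokens leaves the class scores unchanged up to floating-point rounding. That is the sense in which the basic model uses no word-order information, and it is also why the per-token work can run in parallel.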
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Natural Language Inference | SNLI (test) | Accuracy 86.8 | 681 |
| Natural Language Inference | SNLI | Accuracy 86.3 | 174 |
| Natural Language Inference | SNLI (train) | Accuracy 90.5 | 154 |
| Natural Language Inference | SciTail (test) | Accuracy 72.3 | 86 |
| Paraphrase Identification | Quora Question Pairs (test) | Accuracy 87.77 | 72 |
| Question Answering | Natural Question (NQ) (dev) | -- | 72 |
| Paraphrase Identification | Quora Question Pairs (dev) | Accuracy 87.8 | 14 |
| Dialogue Disentanglement | Ubuntu IRC (dev) | VI 0.874 | 9 |
| Commonsense Reasoning | HSWAG Out-of-Domain (test) | Accuracy 32.48 | 8 |
| Commonsense Reasoning | SWAG In-Domain (test) | Accuracy 46.8 | 8 |
Showing 10 of 16 rows