Natural Language Processing (almost) from Scratch
About
We propose a unified neural network architecture and learning algorithm that can be applied to various natural language processing tasks including: part-of-speech tagging, chunking, named entity recognition, and semantic role labeling. This versatility is achieved by trying to avoid task-specific engineering and therefore disregarding a lot of prior knowledge. Instead of exploiting man-made input features carefully optimized for each task, our system learns internal representations on the basis of vast amounts of mostly unlabeled training data. This work is then used as a basis for building a freely available tagging system with good performance and minimal computational requirements.
Ronan Collobert, Jason Weston, Leon Bottou, Michael Karlen, Koray Kavukcuoglu, Pavel Kuksa • 2011
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Named Entity Recognition | CoNLL 2003 (test) | F1 Score | 89.59 | 539 |
| Named Entity Recognition | CoNLL English 2003 (test) | F1 Score | 89.59 | 135 |
| Chunking | CoNLL 2000 (test) | F1 Score | 94.32 | 88 |
| Named Entity Recognition | Conll 2003 | F1 Score | 89.59 | 86 |
| Part-of-Speech Tagging | Penn Treebank (test) | Accuracy | 97.29 | 64 |
| Part-of-Speech Tagging | WSJ (test) | Accuracy | 97.29 | 51 |
| Named Entity Recognition | CoNLL (test) | -- | -- | 28 |
| POS Tagging | PTB (test) | Accuracy | 97.29 | 24 |
| Part-of-Speech Tagging | Penn Treebank POS (test) | F1 Score | 97.29 | 10 |
| Chunking | CoNLL 2000 | F1 Score | 94.32 | 10 |