Universal Dependency Parsing from Scratch
About
This paper describes Stanford's system for the CoNLL 2018 UD Shared Task. We introduce a complete neural pipeline that takes raw text as input and performs all tasks required by the shared task, from tokenization and sentence segmentation to POS tagging and dependency parsing. Our single-system submission achieved very competitive performance on big treebanks. Moreover, after fixing an unfortunate bug, our corrected system would have placed 2nd, 1st, and 3rd on the official evaluation metrics LAS, MLAS, and BLEX, respectively, and would have outperformed all submitted systems on the low-resource treebank categories on all metrics by a large margin. We further show the effectiveness of different model components through extensive ablation studies.
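The staged design described above can be sketched as a chain of functions, each consuming the previous stage's output: raw text is tokenized, the token stream is segmented into sentences, each sentence is POS-tagged, and the tagged sentence is parsed into head attachments. The stage implementations below are toy placeholders for illustration only, not the paper's neural models; all function names are hypothetical.

```python
def tokenize(text):
    # Toy tokenizer: split off sentence-final periods, then whitespace-split.
    return text.replace(".", " .").split()

def segment(tokens):
    # Toy segmenter: close a sentence at each "." token.
    sentences, current = [], []
    for tok in tokens:
        current.append(tok)
        if tok == ".":
            sentences.append(current)
            current = []
    if current:
        sentences.append(current)
    return sentences

def pos_tag(sentence):
    # Placeholder tagger: label punctuation, call everything else "X".
    return [(tok, "PUNCT" if tok == "." else "X") for tok in sentence]

def parse(tagged):
    # Placeholder parser: token 1 is the root (head 0); all others
    # attach to token 1. Output is (index, token, head_index).
    return [(i + 1, tok, 0 if i == 0 else 1)
            for i, (tok, _) in enumerate(tagged)]

def pipeline(text):
    # Full pipeline: text -> tokens -> sentences -> tags -> parses.
    return [parse(pos_tag(sent)) for sent in segment(tokenize(text))]
```

In the actual system each placeholder would be a trained neural component, but the data flow between stages is the same.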
Related benchmarks
| Task | Dataset | Metric | Score | Rank |
|---|---|---|---|---|
| Dependency Parsing | CoNLL 2018 UD Shared Task, big treebanks 2.0 (dev) | LAS (Labeled Attachment Score) | 83.03 | 8 |
| Dependency Parsing | Universal Dependencies, low-resource treebanks (test) | -- | -- | 8 |
| Universal Dependency Parsing | CoNLL 2018 Shared Task, all treebanks (test) | UPOS Accuracy | 89.95 | 5 |
| Dependency Parsing | Universal Dependencies (UD), small 3,000-token target treebank 1.0 (test) | -- | -- | 3 |
| Dependency Parsing | Universal Dependencies PUD treebanks (test) | LAS (F1) | 82.25 | 2 |
| Universal Dependency Parsing | CoNLL 2018 Shared Task, big treebanks (test) | Token Accuracy | 99.43 | 2 |
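The LAS metric reported in the table is the fraction of tokens whose predicted head index and dependency label both match the gold annotation. A minimal sketch, assuming each token is represented as a `(head_index, label)` pair:

```python
def las(gold, pred):
    """Labeled Attachment Score: fraction of tokens whose predicted
    (head_index, label) pair exactly matches the gold pair."""
    assert len(gold) == len(pred), "token sequences must align"
    correct = sum(1 for g, p in zip(gold, pred) if g == p)
    return correct / len(gold)

# Hypothetical 3-token sentence: the third token's head is wrong,
# so 2 of 3 attachments count as correct.
gold = [(2, "nsubj"), (0, "root"), (2, "obj")]
pred = [(2, "nsubj"), (0, "root"), (1, "obj")]
score = las(gold, pred)  # 2/3
```

UAS (unlabeled attachment score) is the same computation comparing head indices only; the shared task's official scorer additionally aligns system tokens to gold tokens, which this sketch omits.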