
Towards Better UD Parsing: Deep Contextualized Word Embeddings, Ensemble, and Treebank Concatenation

About

This paper describes our system (HIT-SCIR) submitted to the CoNLL 2018 shared task on Multilingual Parsing from Raw Text to Universal Dependencies. We base our submission on Stanford's winning system for the CoNLL 2017 shared task and make two effective extensions: 1) incorporating deep contextualized word embeddings into both the part-of-speech tagger and the parser; 2) ensembling parsers trained with different initializations. We also explore different ways of concatenating treebanks for further improvements. Experimental results on the development data show the effectiveness of our methods. In the final evaluation, our system was ranked first according to LAS (75.84%) and outperformed the other systems by a large margin.
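The abstract does not say how the ensemble combines its member parsers. One common approach, shown here purely as an assumption, is to average each member's per-token head probability distribution and then pick the most probable head. The function names and the toy probabilities are hypothetical and are not taken from the HIT-SCIR system.

```python
from typing import List

# probs[i][h] = probability that token i (0-based) attaches to head h,
# where h == 0 stands for the artificial root and h == k for the k-th token.
HeadProbs = List[List[float]]


def average_head_probs(member_probs: List[HeadProbs]) -> HeadProbs:
    """Average the head distributions produced by several parsers."""
    n_members = len(member_probs)
    n_tokens = len(member_probs[0])
    n_heads = len(member_probs[0][0])
    return [
        [sum(m[i][h] for m in member_probs) / n_members for h in range(n_heads)]
        for i in range(n_tokens)
    ]


def greedy_heads(avg_probs: HeadProbs) -> List[int]:
    """Pick the most probable head for every token independently."""
    return [max(range(len(row)), key=row.__getitem__) for row in avg_probs]


# Toy two-token sentence scored by two hypothetical ensemble members.
member_a = [[0.6, 0.0, 0.4], [0.7, 0.3, 0.0]]
member_b = [[0.2, 0.0, 0.8], [0.9, 0.1, 0.0]]

averaged = average_head_probs([member_a, member_b])
heads = greedy_heads(averaged)
```

Picking heads greedily does not guarantee a well-formed tree; a real parser would typically run a tree-constrained decoder over the averaged scores instead.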

Wanxiang Che, Yijia Liu, Yuxuan Wang, Bo Zheng, Ting Liu • 2018

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Dependency Parsing | CoNLL UD Shared Task big treebanks 2018 2.0 (dev) | LAS (Labeled Attachment Score) | 84.37 | 8 |
| Dependency Parsing | zh gsd CoNLL 2018 Shared Task (test) | LAS | 76.77 | 5 |
| Universal Dependency Parsing | CoNLL Shared Task Big treebanks 2018 (test) | Token Accuracy | 99.51 | 2 |
| Dependency Parsing | af afribooms CoNLL 2018 Shared Task (test) | LAS | 85.47 | 1 |
| Dependency Parsing | ar padt CoNLL 2018 Shared Task (test) | LAS | 73.63 | 1 |
| Dependency Parsing | bg btb CoNLL 2018 Shared Task (test) | LAS | 91.22 | 1 |
| Dependency Parsing | br keb CoNLL 2018 Shared Task (test) | LAS | 8.54 | 1 |
| Dependency Parsing | en ewt CoNLL 2018 Shared Task (test) | LAS | 84.57 | 1 |
| Dependency Parsing | ja gsd CoNLL 2018 Shared Task (test) | LAS | 83.11 | 1 |
| Dependency Parsing | th pud CoNLL 2018 Shared Task (test) | LAS | 64 | 1 |
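Most results above are reported as LAS (Labeled Attachment Score). As a rough illustration, LAS can be read as the percentage of tokens whose predicted head and dependency label both match the gold annotation. The sketch below assumes gold tokenization, so gold and predicted tokens align one to one. The shared task itself parses raw text, so its official scorer has to align system and gold words first, which this simplified version does not attempt. The function name and the example arcs are hypothetical.

```python
from typing import List, Tuple

# One arc per token: (head index, dependency label); head 0 is the root.
Arc = Tuple[int, str]


def labeled_attachment_score(gold: List[Arc], pred: List[Arc]) -> float:
    """Return the percentage of tokens with the correct head AND label."""
    if len(gold) != len(pred):
        raise ValueError("gold and predicted sequences must align one to one")
    if not gold:
        raise ValueError("cannot score an empty sentence")
    correct = sum(1 for g, p in zip(gold, pred) if g == p)
    return 100.0 * correct / len(gold)


# Toy four-token sentence: the third token has a wrong label,
# the fourth token has a wrong head.
gold_arcs = [(2, "nsubj"), (0, "root"), (2, "obj"), (2, "punct")]
pred_arcs = [(2, "nsubj"), (0, "root"), (2, "nmod"), (3, "punct")]

las = labeled_attachment_score(gold_arcs, pred_arcs)
```

In this toy case two of the four tokens are fully correct, so the score is 50.0.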
