Establishing Strong Baselines for the New Decade: Sequence Tagging, Syntactic and Semantic Parsing with BERT

About

This paper presents new state-of-the-art models for three tasks: part-of-speech tagging, syntactic parsing, and semantic parsing, using the contextualized embedding framework known as BERT. For each task, we first replicate and simplify the current state-of-the-art approach to improve its efficiency. We then evaluate our simplified approaches on those three tasks using token embeddings generated by BERT. Twelve datasets in both English and Chinese are used for our experiments. The BERT models outperform the previously best-performing models by 2.5% on average (7.5% in the most significant case). Moreover, an in-depth analysis of the impact of BERT embeddings is provided using self-attention, which helps in understanding this rich representation. All models and source code are publicly available so that researchers can improve upon and use them to establish strong baselines for the next decade.

Han He, Jinho D. Choi • 2019

Related benchmarks

Task	Dataset	Result
Semantic Dependency Parsing	SemEval Task 18 2015 (WSJ ID)	--	17
Semantic Dependency Parsing	SemEval SDP DM OOD 2015	F1 Score: 90.8	7
Semantic Dependency Parsing	SemEval SDP PSD 2015 (ID)	F1 Score: 86.8	6
Semantic Dependency Parsing	SemEval SDP PAS OOD 2015	F1 (PAS): 94.4	6
Semantic Dependency Parsing	SemEval SDP PSD OOD 2015	F1 Score: 79.5	6

