
Deep Biaffine Attention for Neural Dependency Parsing

About

This paper builds on recent work by Kiperwasser & Goldberg (2016) that uses neural attention in a simple graph-based dependency parser. We use a larger but more thoroughly regularized parser than other recent BiLSTM-based approaches, with biaffine classifiers to predict arcs and labels. Our parser achieves state-of-the-art or near state-of-the-art performance on standard treebanks for six different languages, reaching 95.7% UAS and 94.1% LAS on the most popular English PTB dataset. This makes it the highest-performing graph-based parser on this benchmark, outperforming Kiperwasser & Goldberg (2016) by 1.8% and 2.2%, and comparable to the highest-performing transition-based parser (Kuncoro et al., 2016), which achieves 95.8% UAS and 94.6% LAS. We also show which hyperparameter choices have a significant effect on parsing accuracy, allowing us to achieve large gains over other graph-based approaches.
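The biaffine arc classifier mentioned in the abstract can be sketched in a few lines. The core idea is that the score of word j heading word i is a bilinear product of their vector representations plus a linear "head prior" term, which can be folded into a single matrix product by appending a constant 1 feature to the dependent vector. The sketch below uses NumPy with made-up toy dimensions; the function name and shapes are illustrative assumptions, not the authors' code.

```python
import numpy as np

def biaffine_arc_scores(h_dep, h_head, U):
    """Biaffine arc scorer (illustrative sketch, not the paper's implementation).

    h_dep:  (n, d)   dependent representations, one per word
    h_head: (n, d)   head representations, one per word
    U:      (d+1, d) biaffine weight; the extra row realizes the linear
                     term that scores each head independently of the dependent
    Returns an (n, n) score matrix; entry [i, j] scores word j as head of word i.
    """
    n, d = h_dep.shape
    # Append a bias feature of 1 so one matrix product covers both the
    # bilinear (dep-head interaction) and linear (head prior) terms.
    h_dep_aug = np.concatenate([h_dep, np.ones((n, 1))], axis=1)  # (n, d+1)
    return h_dep_aug @ U @ h_head.T                               # (n, n)

# Toy usage: 3 words with 4-dimensional representations.
rng = np.random.default_rng(0)
scores = biaffine_arc_scores(rng.normal(size=(3, 4)),
                             rng.normal(size=(3, 4)),
                             rng.normal(size=(5, 4)))
heads = scores.argmax(axis=1)  # greedy head choice per word
```

In the paper the inputs come from separate MLPs applied to BiLSTM states, and labels are predicted by an analogous biaffine classifier over the chosen arcs; this sketch shows only the arc-scoring step.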

Timothy Dozat, Christopher D. Manning • 2016

Related benchmarks

Task               | Dataset                                   | Metric                   | Result | Rank
Dependency Parsing | Chinese Treebank (CTB) (test)             | UAS                      | 90.43  | 99
Dependency Parsing | Penn Treebank (PTB) (test)                | LAS                      | 94.1   | 80
Dependency Parsing | English PTB Stanford Dependencies (test)  | UAS                      | 95.84  | 76
Dependency Parsing | WSJ (test)                                | UAS                      | 95.74  | 67
Semantic Parsing   | CFQ (MCD1)                                | Accuracy                 | 42.7   | 33
Semantic Parsing   | CFQ (MCD3)                                | Accuracy                 | 11.62  | 33
Semantic Parsing   | CFQ (MCD2)                                | Accuracy                 | 9.46   | 33
Dependency Parsing | UD 2.2 (test)                             | bg                       | 90.3   | 31
Dependency Parsing | CoNLL German 2009 (test)                  | UAS                      | 94.52  | 25
Semantic Parsing   | COGS (generalization)                     | Accuracy (Generalization)| 81     | 25
Showing 10 of 49 rows
