Simple Hardware-Efficient PCFGs with Independent Left and Right Productions

About

Scaling dense PCFGs to thousands of nonterminals via a low-rank parameterization of the rule probability tensor has been shown to be beneficial for unsupervised parsing. However, PCFGs scaled this way still perform poorly as a language model, and even underperform similarly-sized HMMs. This work introduces SimplePCFG, a simple PCFG formalism with independent left and right productions. Despite imposing a stronger independence assumption than the low-rank approach, we find that this formalism scales more effectively both as a language model and as an unsupervised parser. As an unsupervised parser, our simple PCFG obtains an average F1 of 65.1 on the English PTB, and as a language model, it obtains a perplexity of 119.0, outperforming similarly-sized low-rank PCFGs. We further introduce FlashInside, a hardware IO-aware implementation of the inside algorithm for efficiently scaling simple PCFGs.

Wei Liu, Songlin Yang, Yoon Kim, Kewei Tu • 2023
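
To make "independent left and right productions" concrete, here is a minimal NumPy sketch of an inside pass under one reading of the abstract: each binary rule probability factorizes as p(A -> B C) = left[A, B] * right[A, C]. The function name, the array names (`root`, `left`, `right`, `emit`), and the use of a single symbol set for both nonterminals and preterminals are assumptions made for illustration, not the authors' code. The sketch works in probability space rather than log space and does not attempt to reproduce FlashInside.

```python
import numpy as np

def inside_simple_pcfg(root, left, right, emit, sentence):
    """Inside score of `sentence` when binary rules factorize as
    p(A -> B C) = left[A, B] * right[A, C]  (assumed parameterization).

    root:  (K,)    start-symbol weights
    left:  (K, K)  left[A, B], weight of B as the left child of A
    right: (K, K)  right[A, C], weight of C as the right child of A
    emit:  (K, V)  emit[A, w], weight of A emitting word id w
    """
    n = len(sentence)
    num_symbols = left.shape[0]
    # beta[i, j, A] = inside score of symbol A over words i .. j-1
    beta = np.zeros((n + 1, n + 1, num_symbols))
    for i, w in enumerate(sentence):
        beta[i, i + 1] = emit[:, w]
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            # Project each child span onto the parent symbols separately.
            # Rows index the split point k = i+1 .. j-1, columns index A.
            left_proj = beta[i, i + 1:j] @ left.T    # sum_B left[A, B] * beta[i, k, B]
            right_proj = beta[i + 1:j, j] @ right.T  # sum_C right[A, C] * beta[k, j, C]
            # Because the rule factorizes, the sum over (B, C) pairs is a
            # product of two K-sized sums; then sum over split points.
            beta[i, j] = (left_proj * right_proj).sum(axis=0)
    return float(root @ beta[0, n])
```

With K symbols, each split point costs two K-by-K matrix-vector products instead of a contraction against a dense K-by-K-by-K rule tensor, which is the kind of saving that lets the symbol count grow. A production implementation would typically run in log space to avoid underflow on long sentences.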

Related benchmarks

Task | Dataset | Result | Rank
Unsupervised Parsing | PTB (test) | -- | 75
Unsupervised Constituency Parsing | Chinese Treebank (CTB) (test) | Unlabeled Sentence F1 (Mean): 42.9 | 36