Data-to-Text Generation with Content Selection and Planning
About
Recent advances in data-to-text generation have led to the use of large-scale datasets and neural network models that are trained end-to-end, without explicitly modeling what to say and in what order. In this work, we present a neural network architecture that incorporates content selection and planning without sacrificing end-to-end training. We decompose the generation task into two stages. Given a corpus of data records (paired with descriptive documents), we first generate a content plan highlighting which information should be mentioned and in which order, and then generate the document while taking the content plan into account. Automatic and human-based evaluation experiments show that our model outperforms strong baselines, improving the state of the art on the recently released RotoWire dataset.
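The two-stage decomposition described above can be illustrated with a toy sketch. This is not the paper's neural architecture: the `Record` type, the `plan_content` and `realize` functions, the "largest values first" heuristic, and the sentence template are all hypothetical stand-ins, chosen only to show how a content plan (what to say, in what order) can be produced first and then used to condition text generation.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Record:
    """One data record (hypothetical schema, assumed for illustration)."""
    entity: str  # e.g. a player or team name
    rtype: str   # record type, e.g. "PTS" for points
    value: int


def plan_content(records: List[Record], max_records: int = 3) -> List[Record]:
    """Stage 1: decide which records to mention and in which order.

    Assumption: a simple heuristic (largest values first) stands in for
    the learned content selection and planning model.
    """
    ranked = sorted(records, key=lambda r: r.value, reverse=True)
    return ranked[:max_records]


def realize(plan: List[Record]) -> str:
    """Stage 2: generate text that follows the content plan.

    Assumption: a fixed template stands in for the neural text generator.
    """
    return " ".join(f"{r.entity} recorded {r.value} {r.rtype}." for r in plan)


# Made-up example records, not taken from any dataset.
records = [
    Record("Player A", "PTS", 28),
    Record("Player B", "AST", 11),
    Record("Player A", "REB", 4),
    Record("Player C", "PTS", 17),
]

plan = plan_content(records, max_records=2)
summary = realize(plan)
print(summary)
```

Keeping the plan as an explicit intermediate object makes it possible to inspect what the system chose to say separately from how it said it, which is the motivation for the decomposition in the abstract.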
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Data-to-text generation | MLB (test) | RG Precision | 81.3 | 22 |
| Data-to-text generation | RotoWire (test) | Factual Support Score | 4.9 | 19 |
| Data-to-text generation | ROTOWIRE (dev) | RG Score | 0.3388 | 12 |
| Knowledge Selection | RotoWire-FG | Relation Generation P | 94.21 | 10 |
| Data-to-text generation | MLB (dev) | RG Score | 17.7 | 4 |