Have Your Text and Use It Too! End-to-End Neural Data-to-Text Generation with Semantic Fidelity

About

End-to-end neural data-to-text (D2T) generation has recently emerged as an alternative to pipeline-based architectures. However, it has faced challenges in generalizing to new domains and generating semantically consistent text. In this work, we present DataTuner, a neural, end-to-end data-to-text generation system that makes minimal assumptions about the data representation and the target domain. We take a two-stage generation-reranking approach, combining a fine-tuned language model with a semantic fidelity classifier. Each of our components is learnt end-to-end without the need for dataset-specific heuristics, entity delexicalization, or post-processing. We show that DataTuner achieves state-of-the-art results on the automated metrics across four major D2T datasets (LDC2017T10, WebNLG, ViGGO, and Cleaned E2E), with fluency, as assessed by human annotators, nearing or exceeding that of the human-written reference texts. We further demonstrate that the model-based semantic fidelity scorer in DataTuner is a better assessment tool than traditional, heuristic-based measures. Our generated text has significantly better semantic fidelity than the state of the art across all four datasets.
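The two-stage generation-reranking idea in the abstract can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from DataTuner: `Candidate`, `rerank`, and `toy_is_faithful` are illustrative names, the candidate list stands in for the output of a fine-tuned language model, and the substring check is only a toy stand-in for the learnt semantic fidelity classifier (the abstract argues that a model-based scorer outperforms heuristics of exactly this kind). The ordering rule, faithful candidates first and then by generator score, is likewise an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Candidate:
    text: str        # one generated sentence (stage 1 output)
    lm_score: float  # generator log-probability; higher is better


def toy_is_faithful(data: Dict[str, str], text: str) -> bool:
    """Toy stand-in for a learnt fidelity classifier (assumption):
    a text counts as faithful if every slot value appears in it."""
    lowered = text.lower()
    return all(value.lower() in lowered for value in data.values())


def rerank(
    data: Dict[str, str],
    candidates: List[Candidate],
    is_faithful: Callable[[Dict[str, str], str], bool],
) -> List[Candidate]:
    """Stage 2: put faithful candidates first, then sort by generator score."""
    return sorted(
        candidates,
        key=lambda c: (not is_faithful(data, c.text), -c.lm_score),
    )


# Hypothetical input record and stage 1 candidates.
data = {"name": "Aromi", "food": "Italian"}
candidates = [
    Candidate("Aromi is a restaurant.", lm_score=-1.0),      # omits "Italian"
    Candidate("Aromi serves Italian food.", lm_score=-2.0),  # covers both slots
]

best = rerank(data, candidates, toy_is_faithful)[0]
print(best.text)  # Aromi serves Italian food.
```

In this sketch the candidate with the higher generator score loses because it drops a slot value, which is the kind of error a reranking stage is meant to catch.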

Hamza Harkous, Isabel Groves, Amir Saffari • 2020

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| AMR-to-text generation | LDC2017T10 (test) | BLEU 37.7 | 55 |
| Data-to-text generation | WebNLG (test) | BLEU 61.44 | 39 |
| Graph-to-text generation | WebNLG all v1.0 (test) | BLEU 52.9 | 11 |
| Data-to-text generation | Cleaned E2E (test) | BLEU 43.6 | 9 |
| Data-to-text generation | WebNLG | -- | 8 |
| Data-to-text generation | E2E Cleaned | Fluency 5.46 | 5 |
| Data-to-text generation | VIGGO | Fluency 5.77 | 5 |
| Data-to-text generation | LDC2017T10 | Fluency Score 4.87 | 5 |
| Graph-to-text generation | AMR (Human Evaluation) | Fluency 5.78 | 5 |
| Graph-to-text generation | WebNLG (Human Evaluation) | Fluency 5.74 | 5 |
Showing 10 of 12 rows
