Unsupervised Neural Text Simplification
About
The paper presents a first attempt at unsupervised neural text simplification that relies only on unlabeled text corpora. The core framework consists of a shared encoder and a pair of attentional decoders, and learns to simplify through denoising and discrimination-based losses. The framework is trained on unlabeled text collected from an English Wikipedia dump. Our analysis (both quantitative and qualitative, the latter involving human evaluators) on public test data shows that the proposed model performs text simplification at both the lexical and syntactic levels, and is competitive with existing supervised methods. Adding a small number of labeled pairs improves performance further.
Sai Surya, Abhijit Mishra, Anirban Laha, Parag Jain, Karthik Sankaranarayanan • 2018
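The denoising component mentioned above trains the model to reconstruct a sentence from a corrupted version of itself. A minimal, framework-free sketch of a standard corruption function (random word dropout plus local shuffling); the parameter names and values here are illustrative, not the paper's exact settings:

```python
import random

def add_noise(tokens, drop_prob=0.1, shuffle_window=3, rng=None):
    """Corrupt a token sequence for denoising training.

    Randomly drops words, then locally shuffles the survivors so that
    each token moves at most (shuffle_window - 1) positions. Parameters
    are hypothetical defaults for illustration.
    """
    rng = rng or random.Random()
    # word dropout; keep at least one token so the input is never empty
    kept = [t for t in tokens if rng.random() > drop_prob] or tokens[:1]
    # local shuffle: perturb each position by a bounded random offset,
    # then reorder by the perturbed keys
    keys = [i + rng.uniform(0, shuffle_window) for i in range(len(kept))]
    return [t for _, t in sorted(zip(keys, kept), key=lambda p: p[0])]
```

The encoder sees `add_noise(sentence)` as input and is trained to reconstruct the original `sentence`, which forces it to learn robust sentence representations without any parallel data.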
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Sentence Simplification | TurkCorpus English (test) | SARI | 36.29 | 41 |
| Sentence Simplification | ASSET English (test) | SARI | 35.19 | 37 |
| Text Simplification | WikiLarge (test) | SARI | 37.2 | 27 |
| Text Simplification | Wikipedia-SimpleWikipedia (test) | FE-diff | 10.45 | 9 |
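SARI, the metric in most of the rows above, scores a simplification by comparing the n-grams the system adds, keeps, and deletes against both the source sentence and reference simplifications. A minimal single-reference, set-based sketch of the idea; real implementations (e.g. the EASSE toolkit) use multiset counts over multiple references:

```python
def ngrams(tokens, n):
    """All n-grams of a token list, as a set (simplification of the real metric)."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def _div(a, b):
    return a / b if b else 0.0

def _f1(p, r):
    return 2 * p * r / (p + r) if p + r else 0.0

def simple_sari(source, output, reference, max_n=4):
    """Simplified single-reference SARI: average of add-F1, keep-F1 and
    delete-precision over n-gram orders 1..max_n."""
    src_t, out_t, ref_t = source.split(), output.split(), reference.split()
    scores = []
    for n in range(1, max_n + 1):
        src, out, ref = ngrams(src_t, n), ngrams(out_t, n), ngrams(ref_t, n)
        # add: n-grams the system introduced that the reference also introduced
        f_add = _f1(_div(len((out - src) & ref), len(out - src)),
                    _div(len((ref - src) & out), len(ref - src)))
        # keep: n-grams retained from the source that the reference also kept
        f_keep = _f1(_div(len(out & src & ref), len(out & src)),
                     _div(len(src & ref & out), len(src & ref)))
        # delete: precision only, as in the original metric definition
        p_del = _div(len((src - out) - ref), len(src - out))
        scores.append((f_add + f_keep + p_del) / 3)
    return sum(scores) / len(scores)
```

Copying the source verbatim is penalized (nothing is added or deleted), while matching the reference scores much higher, which is why SARI is preferred over BLEU for simplification.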