Right at My Level: A Unified Multilingual Framework for Proficiency-Aware Text Simplification
About
Text simplification supports second language (L2) learning by providing comprehensible input, consistent with the Input Hypothesis. However, constructing personalized parallel corpora is costly, and existing large language model (LLM)-based readability control methods rely on pre-labeled sentence corpora and primarily target English. We propose Re-RIGHT, a unified reinforcement learning framework for adaptive multilingual text simplification that requires no parallel corpus supervision. We first show that prompting-based lexical simplification at target proficiency levels (CEFR, JLPT, TOPIK, and HSK) performs poorly at easier levels and for non-English languages, even with state-of-the-art LLMs such as GPT-5.2 and Gemini 2.5. To address this, we collect 43K vocabulary-level data points across four languages (English, Japanese, Korean, and Chinese) and train a compact 4B policy model with Re-RIGHT, which integrates three reward modules: vocabulary coverage, semantic preservation, and coherence. Compared to these stronger LLM baselines, Re-RIGHT achieves higher lexical coverage at target proficiency levels while preserving original meaning and fluency.
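To make the vocabulary-coverage reward concrete, here is a minimal sketch of one plausible formulation: the fraction of word tokens in the simplified output that belong to a target-proficiency vocabulary list. The tokenizer, the function name `vocabulary_coverage`, and the example A2 word list are illustrative assumptions, not the paper's actual implementation.

```python
# Sketch of a vocabulary-coverage score: the share of word tokens in the
# simplified text that fall within the learner's target-level vocabulary.
# Tokenization and the vocabulary set below are illustrative placeholders.
import re

def vocabulary_coverage(text: str, allowed_vocab: set[str]) -> float:
    """Return the fraction of word tokens covered by the target-level vocabulary."""
    tokens = re.findall(r"[a-zA-Z']+", text.lower())
    if not tokens:
        return 0.0
    covered = sum(1 for t in tokens if t in allowed_vocab)
    return covered / len(tokens)

# Hypothetical A2-level vocabulary list (toy example)
a2_vocab = {"the", "cat", "sat", "on", "a", "mat", "big", "dog"}
print(vocabulary_coverage("The big cat sat on the mat", a2_vocab))  # → 1.0
```

In an RL setup such as Re-RIGHT, a score like this would be combined with semantic-preservation and coherence rewards to shape the policy's output toward the target level.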
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Text Simplification | Wikipedia English Total (test) | Vocabulary Coverage | 81.6 | 6 |
| Text Simplification | Wikipedia English Easy (test) | Vocabulary Coverage | 66.9 | 6 |
| Text Simplification | Wikipedia Japanese Total (test) | Vocabulary Coverage | 76.0 | 6 |
| Text Simplification | Wikipedia Japanese Easy (test) | Vocabulary Coverage | 60.4 | 6 |
| Text Simplification | Wikipedia Korean Total (test) | Vocabulary Coverage | 70.4 | 6 |
| Text Simplification | Wikipedia Korean Easy (test) | Vocabulary Coverage | 52.9 | 6 |
| Text Simplification | Wikipedia Chinese Total (test) | Vocabulary Coverage | 80.2 | 6 |
| Text Simplification | Wikipedia Chinese Easy (test) | Vocabulary Coverage | 66.1 | 6 |