CEC-Zero: Zero-Supervision Character Error Correction with Self-Generated Rewards

About

Large-scale Chinese spelling correction (CSC) remains critical for real-world text processing, yet existing LLMs and supervised methods lack robustness to novel errors and rely on costly annotations. We introduce CEC-Zero, a zero-supervision reinforcement learning framework that addresses this by enabling LLMs to correct their own mistakes. CEC-Zero synthesizes errorful inputs from clean text, computes cluster-consensus rewards via semantic similarity and candidate agreement, and optimizes the policy with PPO. It outperforms supervised baselines by 10–13 F1 points and strong LLM fine-tunes by 5–8 points across 9 benchmarks, with theoretical guarantees of unbiased rewards and convergence. CEC-Zero establishes a label-free paradigm for robust, scalable CSC, unlocking LLM potential in noisy text pipelines.
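
The pipeline named in the abstract (synthesize errors from clean text, then reward candidates by consensus) can be illustrated with a minimal sketch. This is not the authors' implementation: the confusion set, the function names, the 0.5 weighting, and the use of a character-overlap ratio in place of a real semantic similarity model are all assumptions made for illustration, and the PPO update itself is omitted.

```python
import random
from collections import Counter
from difflib import SequenceMatcher

# Hypothetical confusion set of easily confused characters (assumption).
CONFUSION = {"在": ["再"], "的": ["地", "得"], "做": ["作"]}


def corrupt(sentence, rate=0.3, rng=None):
    """Synthesize an errorful input by swapping in confusable characters."""
    rng = rng or random.Random(0)
    out = []
    for ch in sentence:
        if ch in CONFUSION and rng.random() < rate:
            out.append(rng.choice(CONFUSION[ch]))
        else:
            out.append(ch)
    return "".join(out)


def similarity(a, b):
    """Character-overlap ratio; a stand-in for a semantic similarity model."""
    return SequenceMatcher(None, a, b).ratio()


def consensus_rewards(candidates, weight=0.5):
    """Reward each candidate by combining two signals:
    - agreement: fraction of candidates identical to it
    - mean similarity to all candidates (itself included)
    """
    counts = Counter(candidates)
    n = len(candidates)
    rewards = []
    for cand in candidates:
        agreement = counts[cand] / n
        mean_sim = sum(similarity(cand, other) for other in candidates) / n
        rewards.append(weight * agreement + (1 - weight) * mean_sim)
    return rewards


if __name__ == "__main__":
    clean = "我在家"
    noisy = corrupt(clean, rate=1.0)
    # Pretend these are corrections sampled from the policy for `noisy`.
    sampled = ["我在家", "我在家", "我再家"]
    print(noisy, consensus_rewards(sampled))
```

In this sketch, the correction that most sampled candidates agree on receives the highest reward, so no gold label is needed to rank the outputs. A real system would presumably replace `similarity` with an embedding-based measure and feed the rewards into a PPO trainer.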

Zhiming Lin, Kai Zhao, Sophie Zhang, Peilai Yu, Canran Xiao • 2025

Related benchmarks

Task	Dataset	Metric	Result
Chinese Spelling Correction	CSCD-NS	Sentence Correction F1 Score	79.71	35
Chinese Spelling Check	LEMON	CAR	63.28	21
Chinese Spelling Check	CS	Sentence-level F1	91.78	21
Chinese Spelling Check	LEMON, CSCD-NS, and CS Combined	Average Error	65.14	21

