
R2-Write: Reflection and Revision for Open-Ended Writing with Deep Reasoning

About

While deep reasoning with long chain-of-thought has dramatically improved large language models in verifiable domains like mathematics, its effectiveness for open-ended tasks such as writing remains unexplored. In this paper, we conduct a systematic investigation revealing that existing mainstream reasoning models achieve limited gains on open-ended writing tasks. Our further analysis shows that these models lack deep reflection and revision patterns in open-ended writing, resulting in substantially smaller improvements compared to mathematical reasoning tasks. To address this limitation, we introduce R2-Write: an automated framework that synthesizes high-quality thinking trajectories enriched with explicit reflection and revision patterns through iterative writer-judge interaction. To prevent redundant reflections, we design a process reward mechanism that supervises reflection quality during reinforcement learning, improving both performance and token efficiency. Extensive experiments across multiple creative writing and deep-research benchmarks demonstrate significant improvements, validating that explicitly incorporating reflection and revision patterns unlocks deep reasoning capabilities for open-ended writing tasks.
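As an illustration only, the iterative writer-judge interaction described in the abstract could be sketched as the loop below. The abstract gives no implementation details, so everything here is an assumption: the names `synthesize_trajectory`, `Feedback`, and `Trajectory`, the 0-to-1 score scale, the stopping threshold, and the toy writer and judge are all hypothetical stand-ins for model calls, not the authors' method.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Feedback:
    score: float   # judge's quality score; the [0, 1] scale is an assumption
    critique: str  # natural-language critique, recorded as the reflection


@dataclass
class Trajectory:
    steps: List[str] = field(default_factory=list)  # draft / reflect / revise records
    final_draft: str = ""


def synthesize_trajectory(
    prompt: str,
    writer: Callable[[str, Optional[Feedback]], str],
    judge: Callable[[str, str], Feedback],
    max_rounds: int = 3,
    threshold: float = 0.8,
) -> Trajectory:
    """Alternate writer and judge, logging each reflection and revision."""
    traj = Trajectory()
    draft = writer(prompt, None)  # initial draft, no feedback yet
    traj.steps.append(f"DRAFT: {draft}")
    for _ in range(max_rounds):
        fb = judge(prompt, draft)
        if fb.score >= threshold:  # judge is satisfied, stop revising
            break
        traj.steps.append(f"REFLECT: {fb.critique}")
        draft = writer(prompt, fb)  # revise using the critique
        traj.steps.append(f"REVISE: {draft}")
    traj.final_draft = draft
    return traj


# Toy stand-ins for the writer and judge models (hypothetical, deterministic).
def toy_writer(prompt: str, feedback: Optional[Feedback]) -> str:
    if feedback is None:
        return "A short story."
    return "A short story with a clear ending."


def toy_judge(prompt: str, draft: str) -> Feedback:
    if "ending" in draft:
        return Feedback(0.9, "Looks complete.")
    return Feedback(0.4, "The story lacks an ending.")


if __name__ == "__main__":
    result = synthesize_trajectory("Write a story", toy_writer, toy_judge)
    for step in result.steps:
        print(step)
```

In this sketch the recorded `steps` play the role of a thinking trajectory with explicit reflection and revision entries. A real pipeline would replace the toy functions with model calls and would need its own stopping and filtering criteria.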

Wanlong Liu, Bo Zhang, Chenliang Li, Shaopeng Lai, Yuning Wu, Xuanyu Lei, Ming Yan • 2026

Related benchmarks

Task                                          Dataset             Result                Rank
Writing                                       WritingBench        Score 83.8            58
Discourse-level Chinese-English translation   DiscoX              Accuracy 23.5         19
Professional deep-research writing            Deepresearch-Gym    KPR 72.5              19
Open-ended writing                            HelloBench          Average Score 82      11
Open-ended writing                            DeepResearchBench   Overall Score 46.93   11
Creative Writing                              HelloBench          --                    6
Mathematical Reasoning                        MATH 500            --                    6