In-Place Feedback: Reliable Refinement for Multi-Turn Expert-LLM Collaboration

About

LLM-generated drafts often contain subtle factual or logical errors, yet prior work shows that models struggle to reliably integrate multi-turn feedback aimed at fixing them. We propose in-place feedback, an interaction paradigm in which the user directly edits the model's previous response and the model continues generation from the edited context. In-place feedback consistently outperforms standard multi-turn feedback across five reasoning-intensive benchmarks while requiring fewer tokens, and our fine-grained analysis shows that it applies corrections more reliably and propagates them to subsequent reasoning. A user study with domain experts refining LLM-generated summaries corroborates these findings: participants report higher final-output satisfaction and substantially lower fatigue with in-place feedback, and a mixed strategy combining in-place and multi-turn feedback scores highest on every measured dimension. These results suggest that editing errors directly is a more effective paradigm for expert-LLM collaboration.
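To make the contrast concrete, here is a minimal sketch of how the two feedback styles might assemble the context handed back to the model. It is illustrative only, not the authors' implementation: the `Message` type, both function names, and the string-matching edit are hypothetical. The sketch also assumes that everything after the edited span is discarded so later reasoning is regenerated from the corrected text; the abstract states only that generation continues from the edited context.

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str      # "user" or "assistant"
    content: str


def multi_turn_context(question: str, draft: str, feedback: str) -> list[Message]:
    """Standard multi-turn feedback: the flawed draft stays in the history
    and the correction arrives as an additional user turn."""
    return [
        Message("user", question),
        Message("assistant", draft),
        Message("user", feedback),
    ]


def in_place_context(question: str, draft: str, wrong: str, corrected: str) -> list[Message]:
    """In-place feedback: edit the draft itself and return it as a partial
    assistant turn for the model to continue from.

    Assumption (not stated in the source): text after the edited span is
    dropped, so downstream reasoning is regenerated.
    """
    start = draft.find(wrong)
    if start == -1:
        raise ValueError("span to edit not found in draft")
    edited_prefix = draft[:start] + corrected
    return [
        Message("user", question),
        Message("assistant", edited_prefix),  # generation would resume here
    ]


# Hypothetical example: the draft contains an arithmetic slip (146 instead of 156).
question = "What is 12 * 13 + 4?"
draft = "12 * 13 = 146. Adding 4 gives 150."

multi_turn = multi_turn_context(question, draft, "12 * 13 is 156, not 146.")
in_place = in_place_context(question, draft, "146", "156")
```

In this toy case the in-place context no longer contains the erroneous value at all, whereas the multi-turn context keeps the flawed draft alongside the correction. That difference is one plausible reading of why corrections might be applied more reliably and with fewer tokens, though the sketch itself demonstrates nothing about model behavior.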

Youngbin Choi, Minjong Lee, Saemi Moon, Seunghyuk Cho, Chaehyeon Chung, MoonJeong Park, Dongwoo Kim • 2025

Related benchmarks

Task	Dataset	Result
Reasoning	MMLU-Pro	Accuracy 85.4	264
Mathematical Reasoning	MATH Hard	Accuracy 92.8	208
Knowledge Reasoning	MMLU-Pro	Accuracy 83.4	148
Graduate-level Science Reasoning	GPQA	Accuracy 69	138
Logical reasoning	ZebraLogic (test)	Grid Accuracy 92.2	90
Logical reasoning	ZebraLogic v1.0 (test)	Cell Accuracy 97.7	90
Code Generation	LiveCodeBench	Accuracy 88.6	84
Science Reasoning	GPQA	Accuracy (GPQA) 58.7	72
