Share your thoughts, 1 month free Claude Pro on usSee more
WorkDL logo mark

In-Place Feedback: Reliable Refinement for Multi-Turn Expert-LLM Collaboration

About

LLM-generated drafts often contain subtle factual or logical errors, yet prior work shows that models struggle to reliably integrate multi-turn feedback aimed at fixing them. We propose in-place feedback, an interaction paradigm in which the user directly edits the model's previous response and the model continues generation from the edited context. In-place feedback consistently outperforms standard multi-turn feedback across five reasoning-intensive benchmarks while requiring fewer tokens, and our fine-grained analysis shows that it applies corrections more reliably and propagates them to subsequent reasoning. A user study with domain experts refining LLM-generated summaries corroborates these findings: participants report higher final-output satisfaction and substantially lower fatigue with in-place feedback, and a mixed strategy combining in-place and multi-turn feedback scores highest on every measured dimension. These results suggest that editing errors directly is a more effective paradigm for expert-LLM collaboration.

Youngbin Choi, Minjong Lee, Saemi Moon, Seunghyuk Cho, Chaehyeon Chung, MoonJeong Park, Dongwoo Kim• 2025

Related benchmarks

TaskDatasetResultRank
ReasoningMMLU-Pro
Accuracy85.4
241
Mathematical ReasoningMATH Hard
Accuracy92.8
198
Graduate-level Science ReasoningGPQA
Accuracy69
121
Knowledge ReasoningMMLU-Pro
Accuracy83.4
120
Logical reasoningZebraLogic (test)
Grid Accuracy92.2
90
Logical reasoningZebraLogic v1.0 (test)
Cell Accuracy97.7
90
Code GenerationLiveCodeBench
Accuracy88.6
84
Science ReasoningGPQA
Accuracy (GPQA)58.7
72
Showing 8 of 8 rows

Other info

Follow for update