
FlowSteer: Towards Agents Designing Agentic Workflows via Reinforced Progressive Canvas Editing

About

In recent years, agentic workflows have been widely applied to solve complex human tasks. However, existing workflow construction still faces key challenges, including dependence on human design, the lack of graph-level execution feedback, and the inability to repair errors in-loop during long-horizon construction. To address these challenges, we propose FlowSteer, a new paradigm of Agent Designing Agentic Workflows, in which a single agent designs, end to end, the workflow that a downstream executor runs. To support this paradigm, we introduce the Workflow Canvas, a novel executable graph-state environment that returns syntax-checked execution feedback for every atomic edit. Built on the canvas, we further propose Reinforced Progressive Canvas Editing, in which a lightweight policy agent issues one atomic edit per turn conditioned on real canvas feedback and is trained end-to-end via reinforcement learning. Moreover, FlowSteer provides a plug-and-play framework that supports diverse operator libraries and interchangeable LLM backends. Experimental results on twelve datasets show that FlowSteer significantly outperforms baselines across various tasks. Our code is available at https://anonymous.4open.science/r/FlowSteer-9B2E.
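
The edit-and-feedback loop described in the abstract can be illustrated with a toy sketch. This is not FlowSteer's implementation: the `Canvas`, `Feedback`, `run_episode`, and `make_scripted_policy` names, the two edit types, and the operator names are all hypothetical, the "policy" is a hand-written script rather than a trained agent, and the reinforcement learning step is omitted entirely. It only shows the general shape of issuing one atomic edit per turn, checking it, and repairing a rejected edit in-loop.

```python
from dataclasses import dataclass, field


@dataclass
class Feedback:
    ok: bool
    message: str


@dataclass
class Canvas:
    """Toy graph-state environment (hypothetical, not FlowSteer's actual API)."""
    operators: set                               # allowed operator names (assumed)
    nodes: dict = field(default_factory=dict)    # node id -> operator name
    edges: set = field(default_factory=set)      # (source id, destination id)

    def apply(self, edit):
        """Check one atomic edit, apply it only if valid, and return feedback."""
        kind = edit[0]
        if kind == "add_node":
            _, node_id, op = edit
            if op not in self.operators:
                return Feedback(False, f"unknown operator: {op}")
            if node_id in self.nodes:
                return Feedback(False, f"duplicate node: {node_id}")
            self.nodes[node_id] = op
            return Feedback(True, f"added node {node_id}")
        if kind == "add_edge":
            _, src, dst = edit
            if src not in self.nodes or dst not in self.nodes:
                return Feedback(False, f"edge references missing node: {src}->{dst}")
            self.edges.add((src, dst))
            return Feedback(True, f"added edge {src}->{dst}")
        return Feedback(False, f"unknown edit type: {kind}")


def run_episode(canvas, policy, max_turns=10):
    """One edit per turn; the policy sees the previous turn's feedback."""
    history = []
    feedback = None
    for _ in range(max_turns):
        edit = policy(canvas, feedback)
        if edit is None:  # the policy signals that it is done
            break
        feedback = canvas.apply(edit)
        history.append((edit, feedback))
    return history


def make_scripted_policy(plan, repair_op):
    """Stand-in for a learned policy: replays a plan and, when an edge is
    rejected, adds the destination node with `repair_op` and retries the edge."""
    queue = list(plan)
    issued = []

    def policy(canvas, feedback):
        if feedback is not None and not feedback.ok and issued[-1][0] == "add_edge":
            _, _, dst = issued[-1]
            queue[:0] = [("add_node", dst, repair_op), issued[-1]]
        if not queue:
            return None
        edit = queue.pop(0)
        issued.append(edit)
        return edit

    return policy


canvas = Canvas(operators={"Plan", "Solve"})
plan = [("add_node", "plan", "Plan"),
        ("add_edge", "plan", "solve")]  # deliberately invalid: "solve" is missing
history = run_episode(canvas, make_scripted_policy(plan, repair_op="Solve"))
for edit, fb in history:
    print(edit, "->", "ok" if fb.ok else "rejected", "|", fb.message)
```

In this sketch the second edit is rejected because its destination node does not exist yet; the scripted policy reads that feedback, adds the node, and retries the edge on the following turns. A learned policy would take the place of `make_scripted_policy`.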

Mingda Zhang, Wenjin Liu, Tiesunlong Shen, Qika Lin, Rui Mao, Erik Cambria, Xiaoying Tang, Haoran Luo • 2026

Related benchmarks

Task                    Dataset           Result          Rank
Code Generation         HumanEval         Pass@1 92.96    1043
Mathematical Reasoning  MATH              Accuracy 81.25  535
Mathematical Reasoning  MathQA            Accuracy 88.67  354
Mathematical Reasoning  AIME 2025         Accuracy 26.67  227
Question Answering      SQuAD 2.0         F1 83.67        215
Question Answering      HotpotQA          F1 84.98        132
Code Generation         APPS              Pass@1 49.21    111
Question Answering      TriviaQA          F1 84.11        46
Question Answering      NaturalQuestions  F1 62.56        42
Code Generation         HumanEval OOD     Pass@1 93.75    39
Showing 10 of 24 rows
