AFlow: Automating Agentic Workflow Generation

About

Large language models (LLMs) have demonstrated remarkable potential in solving complex tasks across diverse domains, typically by employing agentic workflows that follow detailed instructions and operational sequences. However, constructing these workflows requires significant human effort, limiting scalability and generalizability. Recent research has sought to automate the generation and optimization of these workflows, but existing methods still rely on initial manual setup and fall short of achieving fully automated and effective workflow generation. To address this challenge, we reformulate workflow optimization as a search problem over code-represented workflows, where LLM-invoking nodes are connected by edges. We introduce AFlow, an automated framework that efficiently explores this space using Monte Carlo Tree Search, iteratively refining workflows through code modification, tree-structured experience, and execution feedback. Empirical evaluations across six benchmark datasets demonstrate AFlow's efficacy, yielding a 5.7% average improvement over state-of-the-art baselines. Furthermore, AFlow enables smaller models to outperform GPT-4o on specific tasks at 4.55% of its inference cost in dollars. The code is available at https://github.com/FoundationAgents/AFlow.

Jiayi Zhang, Jinyu Xiang, Zhaoyang Yu, Fengwei Teng, Xionghui Chen, Jiaqi Chen, Mingchen Zhuge, Xin Cheng, Sirui Hong, Jinlin Wang, Bingnan Zheng, Bang Liu, Yuyu Luo, Chenglin Wu • 2024
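
The search formulation in the abstract can be illustrated with a toy sketch. This is not AFlow's implementation: the operator names, the `evaluate` scoring rule, the `modify` edit rule, and the use of plain UCT for selection are all assumptions made for illustration. In the real setting, each operator would be an LLM-invoking node, `modify` would be an LLM rewriting workflow code, and `evaluate` would run the workflow on a benchmark to obtain execution feedback.

```python
import math
import random
from dataclasses import dataclass, field

# Hypothetical operator names standing in for LLM-invoking nodes.
OPERATORS = ("generate", "review", "revise", "ensemble")


@dataclass
class Node:
    workflow: tuple                      # ordered operators: a toy "code-represented workflow"
    parent: "Node | None" = None
    children: list = field(default_factory=list)
    visits: int = 0
    total_score: float = 0.0


def evaluate(workflow):
    """Toy stand-in for execution feedback: reward distinct operators, penalize repeats."""
    distinct = len(set(workflow))
    return 0.25 * distinct - 0.05 * (len(workflow) - distinct)


def modify(workflow, rng, max_len=6):
    """Toy stand-in for code modification: append an operator or replace one."""
    ops = list(workflow)
    if len(ops) < max_len and rng.random() < 0.7:
        ops.append(rng.choice(OPERATORS))
    else:
        ops[rng.randrange(len(ops))] = rng.choice(OPERATORS)
    return tuple(ops)


def uct(child, parent_visits, c=1.4):
    """Generic UCT score (an assumed selection rule, not taken from the source)."""
    if child.visits == 0:
        return math.inf
    mean = child.total_score / child.visits
    return mean + c * math.sqrt(math.log(parent_visits) / child.visits)


def search(iterations=50, seed=0, max_children=3):
    rng = random.Random(seed)
    root = Node(workflow=("generate",), visits=1)
    root.total_score = evaluate(root.workflow)
    best_workflow, best_score = root.workflow, root.total_score

    for _ in range(iterations):
        # Selection: descend through fully expanded nodes by UCT.
        node = root
        while len(node.children) >= max_children:
            node = max(node.children, key=lambda ch: uct(ch, node.visits))

        # Expansion: derive a new workflow by modifying the selected one.
        child = Node(workflow=modify(node.workflow, rng), parent=node)
        node.children.append(child)

        # Evaluation: score the new workflow (execution feedback in the real setting).
        score = evaluate(child.workflow)
        if score > best_score:
            best_workflow, best_score = child.workflow, score

        # Backpropagation: record the result along the path to the root,
        # so the tree accumulates experience about which branches pay off.
        cur = child
        while cur is not None:
            cur.visits += 1
            cur.total_score += score
            cur = cur.parent

    return best_workflow, best_score


if __name__ == "__main__":
    workflow, score = search()
    print(workflow, round(score, 3))
```

The tree keeps every workflow that was tried together with its accumulated score, so later iterations can return to a promising earlier variant instead of only extending the most recent one.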

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Mathematical Reasoning | GSM8K | -- | 1398 |
| Code Generation | HumanEval | Pass@1 97.78 | 1043 |
| Mathematical Reasoning | GSM8K (test) | Accuracy 96.5 | 954 |
| Mathematical Reasoning | MATH | -- | 882 |
| Language Understanding | MMLU | Accuracy 69.31 | 844 |
| Mathematical Reasoning | GSM8K (test) | Accuracy 94.91 | 816 |
| Reasoning | BBH | Accuracy 76 | 726 |
| Code Generation | HumanEval (test) | Pass@1 94.2 | 612 |
| Mathematical Reasoning | MATH 500 | Accuracy (Acc) 77 | 543 |
| Mathematical Reasoning | MATH | Accuracy 70.31 | 535 |
Showing 10 of 257 rows